RELATED APPLICATIONS

This patent application claims the benefit of U.S. Provisional Application Nos.
60/185,987, filed March 1, 2000; 60/263,473, filed January 23, 2001; 60/263,668, filed
January 23, 2001; 60/263,424, filed January 23, 2001; and the U.S. Provisional
Application of Henry Daniell entitled "Expression of the Native Cholera Toxin B
Subunit Gene as Oligomers in Transgenic Tobacco Chloroplasts," filed February 23,
2001. This patent application is also related to patent publication PCT/IB98/01199, WO
99/10513 Specific Aims; international publication date 4 March 1999. These earlier
provisional applications and publications are hereby incorporated by reference.

TECHNICAL FIELD 

This invention relates to compositions, methods and products of
Pharmaceutical Proteins, Human Therapeutics, Human Serum Albumin, Insulin, and Native
Cholera Toxin B Subunit in Transgenic Plastids, containing transformed plastids.

This invention relates to several embodiments which are disclosed herein in
several specifications and corresponding figures titled Pharmaceutical Proteins,
Human Therapeutics, Human Serum Albumin, Insulin, and Native Cholera Toxin B
Subunit in Transgenic Plastids, presented as one patent application and a set of claims
thereof.

NON-OBVIOUS NATURE OF INVENTION 



Despite the potential advantages of chloroplasts for biopharmaceutical production, it was
not obvious that pharmaceutical proteins expressed in plastids would assemble in this organelle.
Prior to this patent application there were no published reports of biopharmaceutical protein
expression in chloroplasts. Indeed, there were valid reasons to suggest that such expression
would be problematic. Proinsulin contains both the A and B chains and the C-peptide that connects
them. It is synthesized as pre-proinsulin by the pancreas. After proper folding, disulfide
bridges are established between the A and B chains. NIH reviewers noted that chloroplast
expression of high levels of properly assembled proinsulin was unanticipated. Nor was it
obvious, as pointed out by NIH reviewers, that proinsulin expressed within plastids would be
fully functional. Prior to this invention, there was no report of expression of proinsulin within
plastids.

Similarly, prior to this invention, there was no report of expression of Human Serum
Albumin within plastids. Human serum albumin (HSA) is a globular protein of 66.5 kDa that
must be properly folded and stabilized with seventeen disulfide bridges. Each human serum
albumin molecule consists of three structurally similar globular domains, and the disulfides are
positioned in a repeated series of nine loop-link-loop structures centered around eight sequential
Cys-Cys pairs. HSA is initially synthesized as pre-pro-albumin by the liver and released from the
endoplasmic reticulum after removal of the aminoterminal prepeptide of 18 amino acids. The
pro-albumin is further processed in the Golgi complex, where the other 6 aminoterminal residues
of the propeptide are cleaved by a serine proteinase. This results in the secretion of the mature
polypeptide of 585 amino acids. It was likewise unanticipated that fully assembled HSA would
be synthesized in large amounts within plastids.



Vibrio cholerae causes diarrhea by colonizing the small intestine and producing
enterotoxins, of which the cholera toxin (CT) is considered the main cause of toxicity. CT is a
hexameric AB5 protein having one 27 kDa A subunit, which has toxic ADP-ribosyl transferase
activity, and a non-toxic pentamer of 11.6 kDa B subunits that are non-covalently linked into a
very stable doughnut-like structure into which the toxic active (A) subunit is inserted. The A
subunit of CT consists of two fragments, A1 and A2, which are linked by a disulfide bond. The
enzymatic activity of CT is located solely on the A1 fragment. The A2 fragment of the A subunit
links the A1 fragment and the B pentamer. CT binds via specific interactions of the B-subunit
pentamer with GM1 ganglioside, the membrane receptor, present on the intestinal epithelial cell
surface of the host. The A subunit is then translocated into the cell, where it ADP-ribosylates the
Gs subunit of adenylate cyclase, bringing about the increased levels of cyclic AMP in affected
cells that are associated with the electrolyte and fluid loss of clinical cholera. However, the B
subunit, when administered orally, is a potent mucosal immunogen which can neutralize the
toxicity of CT holotoxin by preventing its binding to intestinal cells. To achieve this effect, the
B-subunit must be assembled in a pentameric form and disulfide bridges must be established
among the subunit structures. It was not obvious that plastids could express and assemble
pentameric CTB. There has been no prior report of expression of CTB or its assembly within
plastids.

There was no certainty in the prior art that biopharmaceuticals would assemble normally in
chloroplasts or that they would retain their functionality. Indeed, there might have been
unforeseen deleterious effects of high-level expression of biopharmaceuticals in chloroplasts on
plant growth or development that were not apparent from the experiences with other transgenes.
The pH and oxidation state of the chloroplast differ from those of human blood or the bacterial
cell in ways that might inhibit or prevent proper folding and assembly.

There are examples of protein complexes in the chloroplast in which all the subunits are
native to the plant, the ribosome being an example. However, the expression and assembly in
transformed chloroplasts of heterologous proteins into multi-protein complexes had not been
reported until the present invention. There is a single example in the literature of an inter-chain
disulfide bond in plant chloroplasts, and that is between neighboring large subunits of the
enzyme ribulose-1,5-bisphosphate carboxylase/oxygenase. The expression and assembly in
transformed chloroplasts of functional proteins consisting of different protein chains, including
disulfide bonds between different subunits, as represented by expression and assembly of a
mammalian protein, had never been demonstrated until the present invention.

PRODUCTION OF PHARMACEUTICAL PROTEINS
IN TRANSGENIC CHLOROPLASTS



RELATED APPLICATION 

This patent application claims the benefit of U.S. Provisional Application No.
60/185,987, filed March 1, 2000. This earlier provisional application is hereby incorporated
by reference.

FIELD OF THE INVENTION 

This invention relates to the production of pharmaceutical proteins in transgenic 
chloroplasts. More particularly, this invention relates to the production of human insulin in 
tobacco plants. 



BACKGROUND 

Research efforts have been made to synthesize high value pharmacologically active 
recombinant proteins in plants. Recombinant proteins such as vaccines, monoclonal 
antibodies, hormones, growth factors, neuropeptides, cytotoxins, serum proteins and 
enzymes have been expressed in nuclear transgenic plants (May et al., 1996). It has been 
estimated that one tobacco plant should be able to produce more recombinant protein than 
a 300-liter fermenter of E. coli. In addition, a tobacco plant produces a million seeds, 
thereby facilitating large-scale production. Tobacco is also an ideal choice because of its 
relative ease of genetic manipulation and an impending need to explore alternate uses for 
this hazardous crop. 

A primary reason for the high cost of production via fermentation is the cost of carbon
source co-substrates as well as maintenance of a large fermentation facility. In contrast,
most estimates of plant production are a thousand-fold less expensive than fermentation.
Tissue specific expression of high value proteins in leaves can enable the use of crop plants
as renewable resources. Harvesting the cobs, tubers, seeds or fruits for food and feed and
leaves for value added products should result in further economy with no additional
investment.

However, one of the major limitations in producing pharmaceutical proteins in plants
is their low level of foreign protein expression, despite reports of higher level expression of
enzymes and certain proteins. May et al. (1998) discuss this problem using the following
examples. Although plant derived recombinant hepatitis B surface antigen was as effective
as a commercial recombinant vaccine, the levels of expression in transgenic tobacco were
low (0.01% of total soluble protein). Even though Norwalk virus capsid protein expressed
in potatoes caused oral immunization when consumed as food (edible vaccine), expression
levels were low (0.3% of total soluble protein). A synthetic gene coding for the human
epidermal growth factor was expressed only up to 0.001% of total soluble protein in
transgenic tobacco. Human serum albumin has been expressed only up to 0.02% of the total
soluble protein in transgenic plants.

Therefore, it is important to increase levels of expression of recombinant proteins in
plants to exploit plant production of pharmacologically important proteins. An alternate
approach is to express foreign proteins in chloroplasts of higher plants. Foreign genes (up to
10,000 copies per cell) have been incorporated into the tobacco chloroplast genome, resulting
in accumulation of recombinant proteins up to 30% of the total cellular protein (McBride et
al., 1994).

The aforementioned approaches (except chloroplast transformation) are limited to
eukaryotic gene expression because prokaryotic genes are expressed poorly in the nuclear
compartment. However, several pharmacologically important proteins (such as insulin,
human serum albumin, antibodies, enzymes, etc.) are produced currently in E. coli. Also,
several bacterial proteins (such as cholera toxin B subunit) are used as oral vaccines against
diarrheal diseases. Therefore, it is important to develop a plant production system for
expression of pharmacologically important proteins that are currently produced in
prokaryotic systems (such as E. coli) via fermentation.

Chloroplasts are prokaryotic compartments inside eukaryotic cells. Since the
transcriptional and translational machinery of the chloroplast is similar to that of E. coli
(Brixey et al., 1997), it is possible to express prokaryotic genes at much higher levels in
plant chloroplasts than in the nucleus. In addition, plant cells contain up to 50,000 copies
of the circular plastid genome (Bendich 1987), which may amplify the foreign gene like a
"plasmid in the plant cell," thereby enabling higher levels of expression. Therefore,
chloroplasts are an ideal choice for expression of recombinant proteins that are currently
expressed in E. coli (such as insulin, human serum albumin, vaccines, antibodies, etc.). We
exploited the chloroplast transformation approach to express a pharmacological protein that
is of no value to the plant. To demonstrate this concept, the GVGVP gene was synthesized
with codons preferred for prokaryotic (EG121) or eukaryotic (TG131) expression. Based on
transcript levels, chloroplast expression of this polymer was a hundred-fold higher than
nuclear expression in transgenic plants (Guda et al., 1999). Recently, we observed
16,966-fold more tps1 transcripts in chloroplast transformants than in the highly expressing
nuclear transgenic plants (Lee et al. 2000, in review).

DETAILED DESCRIPTION 

In our research, we use insulin as a model protein to demonstrate its production as a
value added trait in transgenic tobacco. Most importantly, a significant advantage in the
production of pharmaceutical proteins in chloroplasts is their ability to process eukaryotic
proteins, including folding and formation of disulfide bridges (Dreshcher et al., 1998).
Chaperonin proteins that function in folding and assembly of prokaryotic/eukaryotic proteins
are present in chloroplasts (Vierling 1991; Roy 1989). Also, proteins are activated by
disulfide bond oxidation/reduction cycles using the chloroplast thioredoxin system (Ruelland
and Miginiac-Maslow, 1999) or chloroplast protein disulfide isomerase (Kim and Mayfield,
1997). Accumulation of a fully assembled, disulfide-bonded form of an antibody inside
chloroplasts, even though the plastids were not transformed (During et al. 1990), provides
strong evidence for successful assembly of proinsulin inside chloroplasts. Indeed, we observed
fully assembled heavy and light chains of humanized Guy's 13 antibody in transgenic
tobacco chloroplasts (Panchal et al. 2000, in review). Such folding and assembly eliminates
the need for post-purification processing of pharmaceutical proteins. Chloroplasts may also
be isolated from crude homogenates by centrifugation (1500 × g). This fraction is free of
other cellular proteins. Isolated chloroplasts are burst open by osmotic shock to release
foreign proteins that are compartmentalized in this organelle along with a few other native
soluble proteins (Daniell and McFadden, 1987).

PCT/US01/06288, WO 01/72959

GVGVP is a PBP made from synthetic genes. At lower temperatures the polymers
exist as more extended molecules which, on raising the temperature above the transition
range, hydrophobically fold into dynamic structures called β-spirals that further aggregate
by hydrophobic association to form twisted filaments (Urry, 1991; Urry, et al., 1994).
Inverse temperature transition offers several advantages. Expenses associated with
chromatographic resins and equipment are eliminated. It also facilitates scale up of
purification from grams to kilograms. Milder purification conditions use only a modest
change in temperature and ionic strength. This also facilitates higher recovery, faster
purification and high volume processing. Protein purification is generally the slow step
(bottleneck) in pharmaceutical product development. Through exploitation of this reversible
inverse temperature transition property, simple and inexpensive extraction and purification
is performed. The temperature at which the aggregation takes place can be manipulated by
engineering biopolymers containing varying numbers of repeats and by changing the salt
concentration in solution (McPherson et al., 1996). Chloroplast mediated expression of
insulin-polymer fusion protein eliminates the need for the expensive fermentation process
as well as reagents needed for recombinant protein purification and downstream processing.

Large-scale production of insulin in plants in conjunction with an oral delivery system
is a powerful approach to provide insulin to diabetes patients at an affordable cost and
provide tobacco farmers alternate uses for this hazardous crop. For example, Sun et al.
(1994) showed that feeding a small dose of antigens conjugated to the receptor-binding,
non-toxic B subunit moiety of the cholera toxin (CTB) suppressed systemic T cell-mediated
inflammatory reactions in animals. Oral administration of a myelin antigen conjugated to
CTB has been shown to protect animals against encephalomyelitis, even when given after
disease induction (Sun et al., 1996). Bergerot et al. (1997) reported that feeding small
amounts of human insulin conjugated to CTB suppressed beta cell destruction and clinical
diabetes in adult non-obese diabetic (NOD) mice. The protective effect could be transferred
by T cells from CTB-insulin treated animals and was associated with reduced insulitis.
These results demonstrate that protection against autoimmune diabetes can indeed be
achieved by feeding small amounts of pancreas islet cell autoantigen linked to CTB
(Bergerot, et al. 1997). Conjugation with CTB facilitates antigen delivery and presentation
to the Gut Associated Lymphoid Tissues (GALT) due to its affinity for the cell surface
receptor GM1-ganglioside located on GALT cells, for increased uptake and immunologic
recognition (Arakawa et al. 1998). Transgenic potato tubers expressed up to 0.1% CTB-insulin
fusion protein of total soluble protein, which retained GM1-ganglioside binding
affinity and native antigenicity for both CTB and insulin. NOD mice fed with transgenic
potato tubers containing microgram quantities of CTB-insulin fusion protein showed a
substantial reduction in insulitis and a delay in the progression of diabetes (Arakawa et al.,
1998). However, for commercial exploitation, the levels of expression need to be increased
in transgenic plants. Therefore, we undertook the expression of CTB-insulin fusion in
transgenic chloroplasts of nicotine free edible tobacco to raise expression to levels
adequate for animal testing.

In accordance with one advantageous feature of this invention, we use poly(GVGVP)
as a fusion protein to enable hyper-expression of insulin and accomplish rapid one step
purification of fusion peptides utilizing the inverse temperature transition properties of this
polymer. In another advantageous feature of this invention, we develop insulin-CTB fusion
protein for oral delivery in nicotine free edible tobacco (LAMD 605). Both features are
accomplished as follows:

a) Develop recombinant DNA vectors for enhanced expression of Proinsulin as fusion 
proteins with GVGVP or CTB via chloroplast genomes of tobacco, 

b) Obtain transgenic tobacco (Petit Havana & LAMD 605) plants, 

c) Characterize transgenic expression of proinsulin polymer or CTB fusion proteins 
using molecular and biochemical methods in chloroplasts, 

d) Employ existing or modified methods of polymer purification from transgenic leaves, 

e) Analyze Mendelian or maternal inheritance of transgenic plants, 

f) Large scale purification of insulin and comparison of current insulin purification 
methods with polymer-based purification method in E. coli and tobacco, 

g) Compare natural refolding in chloroplasts with in vitro processing,

h) Characterization (yield and purity) of proinsulin produced in E. coli and transgenic 
tobacco, and 

i) Assessment of diabetic symptoms in mice fed with edible tobacco expressing CTB- 
insulin fusion protein. 

Diabetes and Insulin: Insulin lowers blood glucose (Oakly et al. 1973). This is a result of
its immediate effect in increasing glucose uptake in tissues. In muscle, under the action of
insulin, glucose is more readily taken up and either converted to glycogen and lactic acid or
oxidized to carbon dioxide. Insulin also affects a number of important enzymes concerned
with cellular metabolism. It increases the activity of glucokinase, which phosphorylates
glucose, thereby increasing the rate of glucose metabolism in the liver. Insulin also
suppresses gluconeogenesis by depressing the function of liver enzymes, which operate the
reverse pathway from proteins to glucose. Lack of insulin can restrict the transport of
glucose into muscle and adipose tissue. This results in increases in blood glucose levels
(hyperglycemia). In addition, the breakdown of natural fat to free fatty acids and glycerol
is increased and there is a rise in the fatty acid content in the blood. Increased catabolism of
fatty acids by the liver results in greater production of ketone bodies. They diffuse from the
liver and pass to the muscles for further oxidation. Soon, ketone body production rate
exceeds oxidation rate and ketosis results. Fewer amino acids are taken up by the tissues and
protein degradation results. At the same time, gluconeogenesis is stimulated and protein is
used to produce glucose. Obviously, lack of insulin has serious consequences.

Diabetes is classified into types I and II. Type I is also known as insulin dependent
diabetes mellitus (IDDM). Usually this is caused by a cell-mediated autoimmune destruction
of the pancreatic β-cells (Davidson, 1998). Those suffering from this type are dependent on
external sources of insulin. Type II is known as noninsulin-dependent diabetes mellitus
(NIDDM). This usually involves resistance to insulin in combination with its
underproduction. These prominent diseases have led to extensive research into microbial
production of recombinant human insulin (rHI).

Expression of Recombinant Human Insulin in E. coli: In 1978, two thousand kilograms
of insulin were used in the world each year; half of this was used in the United States
(Steiner et al., 1978). At that time, the number of diabetics in the US was increasing 6%
every year (Gunby, 1978). In 1997-98, a 10% increase in sales of diabetes care products and
a 19% increase in insulin products were reported by Novo Nordisk (the world's leading
supplier of insulin), making it a 7.8 billion dollar industry. Annually, 160,000 Americans
are killed by diabetes, making it the fourth leading cause of death. Many methods of
production of rHI have been developed. Insulin genes were first chemically synthesized for
expression in Escherichia coli (Crea et al., 1978). These genes encoded separate insulin A
and B chains. The genes were each expressed in E. coli as fusion proteins with
β-galactosidase (Goeddel et al., 1979). The first documented production of rHI using this
system was reported by David Goeddel from Genentech (Hall, 1988). For reasons explained
later, the genes were fused to the Trp synthase gene. This fusion protein was approved for
commercial production by Eli Lilly in 1982 (Chance and Frank, 1993) with a product name
of Humulin. As of 1986, Humulin was produced from proinsulin genes. Proinsulin contains
both insulin chains and the C-peptide that connects them. Data concerning commercial
production of Humulin and other insulin products is now considered proprietary information
and is not available to the public.

Delivery of Human Insulin: Insulin has been delivered intravenously in the past several
years. However, more recently, alternate methods such as nasal spray are also available.
Oral delivery of insulin is yet another new approach (Mathiowitz et al., 1997). Engineered
polymer microspheres made of biologically erodable polymers, which display strong
interactions with gastrointestinal mucus and cellular linings, can traverse both mucosal
absorptive epithelium and the follicle-associated epithelium, covering the lymphoid tissue
of Peyer's patches. Polymers maintain contact with intestinal epithelium for extended
periods of time and actually penetrate through and between cells. Animals fed with the
poly(FA:PLGA)-encapsulated insulin preparation were able to regulate the glucose load
better than controls, confirming that insulin crossed the intestinal barrier and was released
from the microspheres in a biologically active form (Mathiowitz et al., 1997).

Protein Based Polymers (PBP): The synthetic gene that codes for a bioelastic PBP was
designed after repeated amino acid sequences GVGVP, observed in all sequenced
mammalian elastin proteins (Yeh et al. 1987). Elastin is one of the strongest known natural
fibers and is present in skin, ligaments, and arterial walls. Bioelastic PBPs containing
multiple repeats of this pentamer have remarkable elastic properties, enabling several
medical and non-medical applications (Urry et al. 1993, Urry 1995, Daniell 1995). GVGVP
polymers prevent adhesions following surgery, aid in reconstructing tissues and delivering
drugs to the body over an extended period of time. North American Science Associates, Inc.
reported that GVGVP polymer is non-toxic in mice, non-sensitizing and non-antigenic in
guinea pigs, and non-pyrogenic in rabbits (Urry et al. 1993). Researchers have also observed
that inserting sheets of GVGVP at the sites of contaminated wounds in rats reduces the
number of adhesions that form as the wounds heal (Urry et al. 1993). In a similar manner,
using the GVGVP to encase muscles that are cut during eye surgery in rabbits prevents
scarring following the operation (Urry et al. 1993, Urry 1995). Other medical applications
of bioelastic PBPs include tissue reconstruction (synthetic ligaments and arteries, bones),
wound coverings, artificial pericardia, catheters and programmed drug delivery (Urry, 1995;
Urry et al., 1993, 1996).

We have expressed the elastic PBP (GVGVP)121 in E. coli (Guda et al. 1995, Brixey
et al. 1997), in the fungus Aspergillus nidulans (Herzog et al. 1997), in cultured tobacco cells
(Zhang et al. 1995), and in transgenic tobacco plants (Zhang et al. 1996). In particular,
(GVGVP) has been expressed to such high levels in E. coli that polymer inclusion bodies
occupied up to about 90% of the cell volume. Also, inclusion bodies have been observed in
chloroplasts of transgenic tobacco plants (see attached article, Daniell and Guda, 1997).
Recently, we reported stable transformation of the tobacco chloroplasts by integration and
expression of the biopolymer gene (EG121) into the Large Single Copy region (5,000 copies
per cell) or the Inverted Repeat region (10,000 copies per cell) of the chloroplast genome
(Guda et al., 1999).

PBP as Fusion Proteins: Several systems are now available to simplify protein purification,
including the maltose binding protein (Maina et al. 1988), glutathione S-transferase (Smith
and Johnson 1988), biotinylated (Tsao et al. 1996), thioredoxin (Smith et al. 1998) and
cellulose binding (Ong et al. 1989) proteins. Recombinant DNA vectors for fusion with
short peptides are now available to effectively utilize the aforementioned fusion proteins in
the purification process (Smith et al. 1998; Kim and Raines, 1993; Su et al. 1992). Recombinant
proteins are generally purified by affinity chromatography, using ligands specific to carrier
proteins (Nilsson et al. 1997). While these are useful techniques for laboratory scale
purification, affinity chromatography for large-scale purification is time consuming and cost
prohibitive. Therefore, economical and non-chromatographic techniques are highly
desirable. In addition, a common solution to N-terminal degradation of small peptides is to
fuse foreign peptides to endogenous E. coli proteins. Early in the development of this
technique, β-galactosidase (β-gal) was used as a fusion protein (Goldberg and Goff, 1986).
A drawback of this method was that the β-gal protein is of relatively high molecular weight
(MW 100,000). Therefore, the proportion of the peptide product in the total protein is low.
Another problem associated with the large β-gal fusion is early termination of translation
(Burnette, 1983; Hall, 1988). This occurred when β-gal was used to produce human insulin
peptides because the fusion was detached from the ribosome during translation, thus yielding
incomplete peptides. Other proteins of lower molecular weight have been used as
fusion proteins to increase the peptide production. For example, better yields were obtained
with the tryptophan synthase (190aa) fusion proteins (Hall, 1988; Burnett, 1983).
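
The effect of carrier size on the proportion of peptide product can be illustrated with a rough calculation. The sketch below is illustrative only and is not part of the disclosed method. It assumes an average residue mass of about 110 Da (a common rule of thumb, not a value from this application) and uses the 30-residue insulin B chain as the example peptide; the carrier sizes (MW 100,000 for β-gal, 190 amino acids for the tryptophan synthase fragment) are the figures given above. The function and constant names are hypothetical.

```python
# Rough mass fraction of a small peptide within a carrier fusion protein.
# Assumption: ~110 Da average mass per amino acid residue (rule of thumb).
AVG_RESIDUE_MASS_DA = 110.0


def peptide_mass_fraction(peptide_residues: int, carrier_mass_da: float) -> float:
    """Return the fraction of the fusion protein's mass contributed by the peptide."""
    peptide_mass_da = peptide_residues * AVG_RESIDUE_MASS_DA
    return peptide_mass_da / (peptide_mass_da + carrier_mass_da)


INSULIN_B_CHAIN_RESIDUES = 30                    # human insulin B chain length
BETA_GAL_MASS_DA = 100_000.0                     # carrier mass stated in the text
TRP_SYNTHASE_MASS_DA = 190 * AVG_RESIDUE_MASS_DA  # 190-aa carrier, estimated mass

beta_gal_fraction = peptide_mass_fraction(INSULIN_B_CHAIN_RESIDUES, BETA_GAL_MASS_DA)
trp_fraction = peptide_mass_fraction(INSULIN_B_CHAIN_RESIDUES, TRP_SYNTHASE_MASS_DA)

print(f"beta-gal fusion: {beta_gal_fraction:.1%} of fusion mass is peptide")
print(f"190-aa carrier fusion: {trp_fraction:.1%} of fusion mass is peptide")
```

Under these assumptions the peptide accounts for roughly 3% of the β-gal fusion by mass but roughly 14% of the smaller fusion, which is consistent with the better yields reported for the lower molecular weight carrier.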

Accordingly, one achievement according to this invention is to use poly(GVGVP) as
a fusion protein to enable hyper-expression of insulin and accomplish rapid one step
purification of the fusion peptide. At lower temperatures the polymers exist as more
extended molecules which, on raising the temperature above the transition range,
hydrophobically fold into dynamic structures called β-spirals that further aggregate by
hydrophobic association to form twisted filaments (Urry, 1991). Through exploitation of this
reversible property, simple and inexpensive extraction and purification is performed. The
temperature at which aggregation takes place (Tt) is manipulated by engineering biopolymers
containing varying numbers of repeats or by changing the salt concentration (McPherson et al.,
1996). Another group has recently demonstrated purification of recombinant proteins by
fusion with thermally responsive polypeptides (Meyer and Chilkoti, 1999). Polymers of
different sizes have been synthesized and expressed in E. coli. This approach also eliminates
the need for expensive reagents, equipment and time required for purification.

Cholera Toxin B subunit as a fusion protein: Vibrio cholerae causes diarrhea by
colonizing the small intestine and producing enterotoxins, of which the cholera toxin (CT)
is considered the main cause of toxicity. CT is a hexameric AB5 protein having one 27 kDa
A subunit, which has toxic ADP-ribosyl transferase activity, and a non-toxic pentamer of 11.6
kDa B subunits that are non-covalently linked into a very stable doughnut-like structure into
which the toxic active (A) subunit is inserted. The A subunit of CT consists of two
fragments, A1 and A2, which are linked by a disulfide bond. The enzymatic activity of CT
is located solely on the A1 fragment (Gill, 1976). The A2 fragment of the A subunit links the
A1 fragment and the B pentamer. CT binds via specific interactions of the B subunit
pentamer with GM1 ganglioside, the membrane receptor, present on the intestinal epithelial
cell surface of the host. The A subunit is then translocated into the cell, where it
ADP-ribosylates the Gs subunit of adenylate cyclase, bringing about the increased levels of
cyclic AMP in affected cells that are associated with the electrolyte and fluid loss of clinical
cholera (Lebens et al. 1994). For optimal enzymatic activity, the A1 fragment needs to be
separated from the A2 fragment by proteolytic cleavage of the main chain and by reduction of
the disulfide bond linking them (Mekalanos et al., 1979).

The expression and assembly of CTB in transgenic potato tubers has been reported
(Arakawa et al. 1997). The CTB gene including the leader peptide was fused to an
endoplasmic reticulum retention signal (SEKDEL) at the 3' end to sequester the CTB protein
within the lumen of the ER. The DNA fragment encoding the 21-amino acid leader peptide
of the CTB protein was retained to direct the newly synthesized CTB protein into the lumen
of the ER. Immunoblot analysis indicated that the plant derived CTB protein was
antigenically indistinguishable from the bacterial CTB protein and that oligomeric CTB
molecules (Mr ~ 50 kDa) were the dominant molecular species isolated from transgenic
potato leaf and tuber tissues. Similar to bacterial CTB, plant derived CTB dissociated into
monomers (Mr ~ 15 kDa) during heat or acid treatment.

Enzyme linked immunosorbent assay methods indicated that plant synthesized CTB
protein bound specifically to GM1 gangliosides, the natural membrane receptors of cholera
toxin. The maximum amount of CTB protein detected in auxin induced transgenic potato
leaf and tuber tissues was approximately 0.3% of the total soluble protein. The oral
immunization of CD-1 mice with transgenic potato tissues transformed with the CTB gene
(administered at weekly intervals for a month with a final booster feeding on day 65) has
also been reported. The levels of serum and mucosal anti-cholera toxin antibodies in mice
were found to be sufficient to generate protective immunity against the cytopathic effects of
CT holotoxin. Following intraileal injection with CT, the plant immunized mice showed up to
a 60% reduction in diarrheal fluid accumulation in the small intestine. Systemic and mucosal
CTB-specific antibody titers were determined in both serum and feces collected from immunized
mice by the class-specific chemiluminescent ELISA method, and the endpoint titers for the
three antibody isotypes (IgM, IgG and IgA) were determined.

The extent of CT neutralization in both Vero cell and ileal loop experiments suggested
that anti-CTB antibodies prevent CT binding to cellular GM1-gangliosides. Also, mice fed
with 3 g of transgenic potato exhibited similar intestinal protection as mice gavaged with 30
µg of bacterial CTB. Recombinant LTB [rLTB] (the heat labile enterotoxin produced by
enterotoxigenic E. coli), which is structurally, functionally and immunologically similar to
CTB, was expressed in transgenic tobacco (Arntzen et al. 1998; Haq et al., 1995). These
authors reported that the rLTB retained its antigenicity as shown by immunoprecipitation of
rLTB with antibodies raised to rLTB from E. coli. The rLTB protein was of the right molecular
weight and aggregated to form the pentamer as confirmed by gel permeation
chromatography.

CTB has also been demonstrated to be an effective carrier molecule for induction of
mucosal immunity to polypeptides to which it is chemically or genetically conjugated
(McKenzie et al., 1984; Dertzbaugh et al., 1993). The production of immunomodulatory
transmucosal carrier molecules, such as CTB, in plants may greatly improve the efficacy of
edible plant vaccines (Haq et al., 1995; Thanavala et al., 1995; Mason et al., 1996) and may
also provide novel oral tolerance agents for prevention of such autoimmune diseases as Type
1 diabetes (Zhang et al., 1991), rheumatoid arthritis (Trentham et al., 1993), multiple
sclerosis (Khoury et al., 1990; Miller et al., 1992; Weiner et al., 1993) as well as the
prevention of allergic and allograft rejection reactions (Sayegh et al., 1992; Hancock et al.,
1993). Therefore, expressing a CTB-proinsulin fusion is an ideal approach for oral delivery
of insulin.

Chloroplast Genetic Engineering: Several environmental problems related to plant genetic
engineering now prohibit advancement of this technology and prevent realization of its full
potential. One such common concern is the demonstrated escape of foreign genes through
pollen dispersal from transgenic crop plants to their weedy relatives, creating super weeds or
causing gene pollution among other crops, or toxicity of transgenic pollen to non-target
insects such as butterflies. The high rates of gene flow from crops to wild relatives (as high
as 38% in sunflower and 50% in strawberries) are certainly a serious concern. Clearly,
maternal inheritance (lack of chloroplast DNA in pollen) of the herbicide resistance gene via
chloroplast genetic engineering has been shown to be a practical solution to these problems
(Daniell et al., 1998). Another common concern is the sub-optimal production of Bacillus
thuringiensis (B.t.) insecticidal protein or reliance on a single (or similar) B.t. protein in
commercial transgenic crops resulting in B.t. resistance among target pests. Clearly,
different insecticidal proteins should be produced in lethal quantities to decrease the
development of resistance. Such hyper-expression of a novel B.t. protein in chloroplasts has
resulted in 100% mortality of insects that are up to 40,000-fold resistant to other B.t.
proteins (Kota et al. 1999). Therefore, the chloroplast genome is an attractive target for
expression of foreign genes due to its ability to express extraordinarily high levels of foreign
proteins and efficient containment of foreign genes through maternal inheritance.

When we developed the concept of chloroplast genetic engineering (Daniell and
McFadden, 1988 U.S. Patents; Daniell, World Patent, 1999), it was possible to introduce
isolated intact chloroplasts into protoplasts and regenerate transgenic plants (Carlson, 1973).
Therefore, early investigations on chloroplast transformation focused on the development
of in organello systems using intact chloroplasts capable of efficient and prolonged
transcription and translation (Daniell and Rebeiz, 1982; Daniell et al., 1983, 1986) and
expression of foreign genes in isolated chloroplasts (Daniell and McFadden, 1987).
However, after the discovery of the gene gun as a transformation device (Daniell, 1993), it
was possible to transform plant chloroplasts without the use of isolated plastids and
protoplasts. Chloroplast genetic engineering was accomplished in several phases. Transient
expression of foreign genes in plastids of dicots (Daniell et al., 1990; Ye et al., 1990) was
followed by such studies in monocots (Daniell et al., 1991). Unique to chloroplast
genetic engineering is the development of a foreign gene expression system using
autonomously replicating chloroplast expression vectors (Daniell et al., 1990). Stable
integration of a selectable marker gene into the tobacco chloroplast genome (Svab and
Maliga, 1993) was also accomplished using the gene gun. However, useful genes conferring
valuable traits via chloroplast genetic engineering have been demonstrated only recently. For
example, plants resistant to B.t. sensitive insects were obtained by integrating the cryIAc
gene into the tobacco chloroplast genome (McBride et al., 1995). Plants resistant to B.t.
resistant insects (up to 40,000 fold) were obtained by hyper-expression of the cryIIA gene
within the tobacco chloroplast genome (Kota et al., 1999). Plants have also been genetically
engineered via the chloroplast genome to confer herbicide resistance and the introduced
foreign genes were maternally inherited, overcoming the problem of out-cross with weeds
(Daniell et al., 1998). Chloroplast genetic engineering has also been used to produce
pharmaceutical products that are not used by plants (Guda et al., 2000). Chloroplast genetic
engineering technology is currently being applied to other useful crops (Sidorov et al. 1999;
Daniell, 1999).

Polymer-proinsulin Recombinant DNA Vectors: First we developed independent
chloroplast vectors for the expression of insulin chains A and B as polymer fusion peptides,
as it has been produced in E. coli for commercial purposes in the past. The disadvantage of
this method is that E. coli does not form disulfide bridges in the cell unless the protein is
targeted to the periplasm. Expensive in vitro assembly after purification is necessary for this
approach. Therefore, a better approach is to express the human proinsulin as a polymer
fusion protein. This method is better because chloroplasts are capable of forming disulfide
bridges. Using a single gene, as opposed to the individual chains, eliminates the necessity
of conducting two parallel vector construction processes, as is needed for individual chains.
In addition, the need for individual fermentations and purification procedures is eliminated
by the single gene method. Further, proinsulin products require less processing following
extraction. Another benefit of using the proinsulin is that the C-peptide, which is an
essential part of the proinsulin protein, has recently been shown to play a positive role in
diabetic patients (Ido et al., 1997).

Recently, the human pre-proinsulin gene was obtained from Genentech, Inc. First,
the pre-proinsulin was sub-cloned into pUC19 to facilitate further manipulations. The next
step was to design primers to make chloroplast expression vectors. Since we are interested
in proinsulin expression, the 5' primer was designed to land on the proinsulin sequence. This
FW primer excluded the 69 bases or 23 coded amino acids of the leader or pre-sequence of
preproinsulin. Also, the forward primer included the enzymatic cleavage site for the protease
factor Xa to avoid the use of cyanogen bromide. Beside the Xa-factor site, a SmaI site was
introduced to facilitate subsequent subcloning. The order of the FW primer sequence is SmaI
- Xa-factor - Proinsulin gene. The reverse primer includes BamHI and XbaI sites, plus a
short sequence with homology with the pUC19 sequence following the proinsulin gene. The
297bp PCR product (Xa-Pris) includes three restriction sites, which are the SmaI site at the
5'-end and XbaI/BamHI sites at the 3' end of the proinsulin gene. The Xa-Pris was cloned
into pCR2.1 resulting in pCR2.1-Xa-Pris (4.2kb). Insertion of Xa-Pris into the multiple
cloning site of pCR2.1 resulted in additional flanking restriction enzyme sites that will be
used in subsequent sub-cloning steps. A GVGVP 50-mer was generated as described
previously (Daniell et al. 1997). The ribosome binding sequence was introduced by
digesting pUCs-10, which contains the RBS sequence GAAGGAG, with NcoI and HindIII
flanking sites. The plasmid pUC19-50 was also digested with the same enzymes. The
50mer gene was eluted from the gel and ligated to pUCs-10 to produce pUCs-10-50mer. The
ligation step inserted into the 50mer gene a RBS sequence and a SmaI site outside the gene
to facilitate subsequent fusion to proinsulin.

Another SmaI partial digestion was performed to eliminate the stop codon of the
biopolymer, transform the 50mer to a 40mer, and fuse the 40mer to the Xa-proinsulin
sequence. The conditions for this partial digestion required a decreased DNA concentration
and a 1:15 dilution of SmaI. Once the correct fragment was obtained by the partial
digestion with SmaI (eliminating the stop codon but including the RBS site), it was ligated to
the Xa-proinsulin fusion gene resulting in the construct pCR2.1-40-XaPris. Finally, the
biopolymer (40mer) - proinsulin fusion gene was subcloned into pSBL-CtV2 (chloroplast
vector) by digesting both vectors with XbaI. Then the fusion gene was ligated to the
pSBL-CtV2 and the final vector was called pSBL-OC-XaPris. The orientation of the insert was
checked with NcoI: one of the five colonies chosen had the correct orientation of the gene.
The fusion gene was also subcloned into the pLD-CtV vector and the orientation was checked
with EcoRI and PvuII. One of the four colonies had the correct orientation of the insert.
This vector was called pLD-OC-XaPris (Fig. 2A).

Both chloroplast vectors contain the 16S rRNA promoter (Prrn) driving the selectable
marker gene aadA (aminoglycoside adenyl transferase conferring resistance to
spectinomycin) followed by the psbA 3' region (the terminator from a gene coding for
photosystem II reaction center components) from the tobacco chloroplast genome. The only
difference between these two chloroplast vectors (pSBL and pLD) is the origin of DNA
fragments. Both pSBL and pLD are universal chloroplast expression/integration vectors and
can be used to transform chloroplast genomes of several other plant species (Daniell et al.
1998) because these flanking sequences are highly conserved among higher plants. The
universal vector uses trnA and trnI genes (chloroplast transfer RNAs coding for Alanine and
Isoleucine) from the inverted repeat region of the tobacco chloroplast genome as flanking
sequences for homologous recombination as shown in Figs. 2A and 3B. Because the
universal vector integrates foreign genes within the Inverted Repeat region of the chloroplast
genome, it should double the copy number of insulin genes (from 5000 to 10,000 copies per
cell in tobacco). Furthermore, it has been demonstrated that homoplasmy is achieved even
in the first round of selection in tobacco probably because of the presence of a chloroplast
origin of replication within the flanking sequence in the universal vector (thereby providing
more templates for integration). Because of these and several other reasons, foreign gene
expression was shown to be much higher when the universal vector was used instead of the
tobacco specific vector (Guda et al., 2000).

The DNA sequence of the polymer-proinsulin fusion was determined to confirm the
correct orientation of genes, in frame fusion and lack of stop codons in the recombinant
DNA constructs. DNA sequencing was performed using a Perkin Elmer ABI Prism 373
DNA sequencing system using an ABI Prism Dye Termination Cycle Sequencing Kit. The
kit uses AmpliTaq DNA polymerase. Insertion sites at both ends were sequenced using
primers for each strand. Expression of all chloroplast vectors was first tested in E. coli
before their use in tobacco transformation because of the similarity of protein synthetic
machinery (Brixey et al. 1997). For Escherichia coli expression the XL-1 Blue strain was used.
E. coli was transformed by standard CaCl2 transformation procedures.

Expression and Purification of the Biopolymer-proinsulin fusion protein: Terrific broth
growth medium was inoculated with 40 µl of Ampicillin (100 mg/ml) and 40 µl of the XL-1
Blue MRF Tc strain of E. coli containing pSBL-OC-XaPris plasmid. Similar inoculations
were made for pLD-OC-XaPris and the negative controls, which included both plasmids
containing the gene in the reverse orientation and the E. coli strain without any plasmid.
Then, 24hr cultures were centrifuged at 13,000 rpm for 3 min. The pellets were resuspended
in 500 µl of autoclaved dH2O and transferred to 6ml Falcon tubes. The resuspended pellet
was sonicated, using a High Intensity Ultrasonic processor, for 15 sec at an amplitude of 40
and then kept for 15 sec on ice to extract the fusion protein from cells. This sonication cycle
was repeated 15 times. The sonicated samples were transferred to microcentrifuge tubes and
centrifuged at 4°C at 10,000g for 10 min to purify the fusion protein. After centrifugation,
the supernatants were transferred to microcentrifuge tubes and an equal volume of 2XTN
buffer (100 mM Tris-HCl, pH 8, 100 mM NaCl) was added. Tubes were warmed at 42°C for
25 min to induce biopolymer aggregation. Then the fusion protein was recovered by
centrifuging at 2,500 rpm at 42°C for 3 min. The recovered fusion protein was resuspended
in 100 µl of cold water. The purification process was repeated twice. Also, the fusion
protein was recovered by using 6M Guanidine hydrochloride phosphate buffer, pH 7.0
(instead of water), to facilitate stability of insulin. New cultures were incubated for this step
following the same procedure as described above, except that the pSBL-OC-XaPris
expressing cells were incubated for 24, 48 and 72 hrs. Cultures were centrifuged at 4,000
rpm for 12 min and the pellet was resuspended in 6M Guanidine hydrochloride phosphate
buffer, pH 7.0, and then sonicated as described above. After sonication, samples were run
in a 16.5% Tricine gel, transferred to the nitrocellulose membrane, and immunoblotting was
performed the following day.

A 15% glycine gel was run for 6h at recommended voltage as shown in Fig. 1. Two
different methods of extraction were used. It was observed that when the sonic extract is in
6M Guanidine Hydrochloride Phosphate Buffer, pH 7.0, the molecular weight changes from
its original and correct MW of 24 kDa to a higher MW of approximately 30 kDa (Fig. 1C, I).
This is probably due to the conformation that the biopolymer takes under this kind of buffer,
which is used to maximize the extraction of proinsulin.

The gel was first stained with 0.3M CuCl2 and then the same gel was stained with
Coomassie R-250 Staining Solution for an hour and then destained for 15 min first, and then
overnight. CuCl2 creates a negative stain (Lee et al. 1987). Polymer proteins (without
fusion) appear as clear bands against a blue background in color or dark against a light
semiopaque background (Fig. 1A). This stain was used because other protein stains such as
Coomassie Blue R250 do not stain the polymer protein due to the lack of aromatic side
chains (McPherson et al., 1992). Therefore, the observation of the 24 kDa protein in the R250
stained gel (Fig. 1B) is due to the insulin fusion with the polymer. This observation was
further confirmed by probing these blots with the antihuman proinsulin antibody. As
anticipated, the polymer insulin fusion protein was observed in western blots as shown in
Fig. 1C, even though the binding of antibody was less efficient (probably due to concealment
of insulin epitopes by the polymer). Larger proteins observed as shown in Fig. 1C II are
tetramer and hexamer complexes of proinsulin.

It is evident that the insulin-polymer fusion proteins are stable in E. coli. Confirming
this observation, recently another lab has shown that PBP polymer protein conjugates
(with thioredoxin and tendamistat) undergo thermally reversible phase transition, retaining
the transition behavior of the free polymer (Meyer and Chilkoti, 1999). These results clearly
demonstrate that insulin fusion has not affected the inverse temperature transition property
of the polymer. One of the concerns is the stability of insulin at temperatures used for
thermally reversible purification. Temperature induced production of human insulin has
been in commercial use (Schmidt et al. 1999). Also, the temperature transition can be
lowered by increasing the ionic strength of the solution during purification of this PBP
(McPherson et al., 1996). Thus, GVGVP-fusion could be used to purify a multitude of
economically important proteins in a simple inexpensive step.

Biopolymer-proinsulin fusion gene expression in chloroplast: As described in section d,
the pSBL-OC-R40XaPris and pLD-OC-R40XaPris vectors were introduced into the
tobacco chloroplast genome via particle bombardment (Daniell, 1997). PCR was
performed to confirm biopolymer-proinsulin fusion gene integration into the chloroplast genome.
The PCR products were examined in 0.8% agarose gels. Fig. 2A shows primer landing sites
and expected PCR products. Fig. 2B shows the 1.6 kbp PCR product, confirming integration
of the aadA gene into the chloroplast genome. This 1.6 kbp product is seen in all clones
except L9, which is a mutant. We used primers 2P and 2M to confirm integration of both
the aadA and biopolymer-proinsulin fusion gene. The 1.3 kbp product corresponds to the
native chloroplast fragment and the 3.5 kbp product corresponds to the chloroplast genome
that has integrated all three genes as shown in Figs. 2C and D. All the clones examined at
this time show heteroplasmy, except clones L8d in Fig. 2C and S41b in Fig. 2D, which
show almost homoplasmy.
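
The band patterns described for primers 2P and 2M can be read off mechanically. The following Python sketch is illustrative only: the function name `classify_clone`, the size tolerance and the returned labels are assumptions introduced here, while the 1.3 kbp and 3.5 kbp product sizes are those stated above.

```python
def classify_clone(band_sizes_kbp, native_kbp=1.3, integrated_kbp=3.5, tol=0.1):
    """Interpret PCR bands obtained with primers flanking the integration site.

    band_sizes_kbp: iterable of observed band sizes in kbp.
    tol: hypothetical tolerance for matching a band to an expected size.
    """
    has_native = any(abs(b - native_kbp) <= tol for b in band_sizes_kbp)
    has_integrated = any(abs(b - integrated_kbp) <= tol for b in band_sizes_kbp)
    if has_native and has_integrated:
        return "heteroplasmic"   # both native and transformed genome copies present
    if has_integrated:
        return "homoplasmic"     # only the transformed product is detected by PCR
    if has_native:
        return "untransformed"   # only the native chloroplast fragment
    return "no product"


print(classify_clone([1.3, 3.5]))
print(classify_clone([3.5]))
```

For the CTB clones discussed later, the same function could be called with `integrated_kbp=3.0`. PCR alone only suggests homoplasmy; the labels are a reading aid, not a confirmation.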

Protease Xa Digestion of the Biopolymer-proinsulin fusion protein and Purification of
Proinsulin: Factor Xa was purchased from New England Biolabs at a concentration of 1.0
mg/ml. The Factor Xa is supplied in 20 mM HEPES, 500 mM NaCl, 2 mM CaCl2, 50%
glycerol (pH 8.0). The reaction was carried out in a 1:1 ratio of fusion protein to reaction
buffer. The reaction buffer was made with 20 mM Tris-HCl, 100 mM NaCl, 2 mM CaCl2 (pH
8.0). The enzymatic cleavage of the fusion protein to release the proinsulin protein from the
(GVGVP)40 was initiated by adding the protease to the purified fusion protein at a ratio (w/w)
of approximately 1:500. This digestion was continued for 5 days with mild stirring at 4°C.
Cleavage of the fusion protein was monitored by SDS-PAGE analysis. After the cleavage,
the same conditions are used for purification of the proinsulin protein. The purification steps
are the same as for the purification of the fusion protein, except that instead of recovering
the pellet, the supernatant is saved. We detected cleaved proinsulin in the extracts isolated
in 6M guanidine hydrochloride buffer as shown in Fig. 1C II. Conditions can be optimized
for complete cleavage. The Xa protease has been successfully used to cleave a GVGVP-GST
fusion (McPherson et al. 1992). Therefore, cleavage of proinsulin from GVGVP using
the Xa protease does not pose problems.
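
The release of the fusion partner by Factor Xa can be pictured with a short Python sketch. It rests on assumptions that are not stated in the text: the recognition motif is taken to be the commonly cited Ile-Glu-Gly-Arg (IEGR) with cleavage after the arginine, the function name `cleave_after_xa_site` is hypothetical, and the payload string is a short placeholder rather than the full proinsulin sequence.

```python
XA_SITE = "IEGR"  # assumption: canonical Factor Xa motif; cleavage occurs after the Arg


def cleave_after_xa_site(fusion):
    """Split a one-letter protein sequence after the first Factor Xa motif.

    Returns (carrier_with_site, released_protein). If no motif is found,
    the sequence is returned uncut with an empty released part.
    """
    idx = fusion.find(XA_SITE)
    if idx == -1:
        return fusion, ""
    cut = idx + len(XA_SITE)
    return fusion[:cut], fusion[cut:]


polymer = "GVGVP" * 40     # the 40mer biopolymer carrier
payload = "FVNQHLCGSH"     # placeholder residues standing in for proinsulin
carrier, released = cleave_after_xa_site(polymer + XA_SITE + payload)
print(len(carrier), released)
```

In this toy layout the carrier keeps the recognition motif and the released part starts at the first payload residue, which is the reason the site is placed immediately upstream of the proinsulin gene in the forward primer.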

Vector for CTB expression in chloroplasts: The leader sequence (63 bp) of the native CTB
gene (372 bp) was deleted and a start codon (ATG) introduced at the 5' end of the remaining
CTB gene (309 bp). Primers were designed to introduce a rbs site 5 bases upstream of the
start codon. The 5' primer (38mer) was designed to land on the start codon and the 5'-end of
the CTB gene. This primer had an XbaI site at the 5'-end, the rbs site [GGAGG], a 5 bp
breathing space followed by the first 20 bp of the CTB gene. The 3' primer (32mer) was
designed to land on the 3' end of the CTB gene and it introduced restriction sites at the 3' end
to facilitate subcloning. The 347 bp rCTB PCR product was subcloned into pCR2.1
resulting in pCR2.1-rCTB. The final step was insertion of rCTB into the XbaI site of the
universal or tobacco vector (pLD-CtV2) that allows the expression of the construct in E. coli
and chloroplasts. Restriction enzyme digestion of the pLD-LH-rCTB vector with BamHI
was performed to confirm the correct orientation of the inserted fragment in the vector.
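
The lengths quoted for the CTB gene can be cross-checked with simple arithmetic. The Python sketch below uses only the figures given above (372 bp native gene, 63 bp leader); the function name `ctb_lengths` and the dictionary keys are introduced here for illustration.

```python
def ctb_lengths(native_bp=372, leader_bp=63):
    """Return base-pair and triplet counts after deleting the leader sequence."""
    if leader_bp % 3 != 0:
        raise ValueError("leader deletion would shift the reading frame")
    remaining_bp = native_bp - leader_bp
    return {
        "remaining_bp": remaining_bp,            # gene left after leader removal
        "leader_triplets": leader_bp // 3,       # codons removed with the leader
        "remaining_triplets": remaining_bp // 3  # triplets left in the remaining gene
    }


print(ctb_lengths())
```

The 63 bp leader corresponds to 21 triplets, which agrees with the 21-amino acid leader peptide mentioned for the potato construct, and removing a whole number of codons leaves the remaining 309 bp in frame.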

Because of the similarity of protein synthetic machinery, expression of the chloroplast
vector was tested in E. coli before its use in tobacco transformation. For Escherichia coli
expression the XL-1 Blue MRF Tc strain was used. E. coli was transformed by standard
CaCl2 transformation procedures. Transformed E. coli (24 hrs culture and 48 hrs culture in
100 ml TB with 100 mg/ml ampicillin) and untransformed E. coli (24 hrs culture and 48 hrs
culture in 100 ml TB with 12.5 mg/ml tetracycline) were then centrifuged at 10000 x g in a
Beckman GS-15R centrifuge for 15 min. The pellet was washed with 200 mM Tris-Cl twice
and resuspended in 500 µl extraction buffer (200 mM Tris-Cl, pH 8.0, 100 mM NaCl, 10 mM
EDTA, 2 mM PMSF) and then sonicated using the Autotune Series High Intensity Ultrasonic
Processor. Then, 100 µl aliquots of the sonicated transformed and untransformed cells
[containing 50 - 100 µg of crude protein extract as determined by Bradford protein assay
(Bio-Rad Inc)] and purified CTB (Sigma C-9903) were boiled with 2X SDS sample buffer
and separated on a 15% SDS-PAGE gel in Tris-glycine buffer (25 mM Tris, 250 mM glycine,
pH 8.3, 0.1% SDS). The separated protein was then transferred to a nitrocellulose membrane
by electroblotting using the Trans-Blot Electrophoretic Transfer Cell (Bio-Rad Inc.).

Immunoblot detection of CTB expression in E. coli: Nonspecific antibody reactions were
blocked by incubation of the membrane in 25 ml of 5% non-fat dry milk in TBS buffer for
1 - 3 hrs on a rotary shaker (40 rpm), followed by washing in TBS buffer for 5 min. The
membrane was then incubated for an hour with gentle agitation in 30 ml of a 1:5000 dilution
of rabbit anti-cholera antiserum (Sigma C-3062) in TBS with Tween-20 [TBST] (containing
1% non-fat dry milk) followed by washing 3 times in TBST buffer. The membrane was
incubated for an hour at room temperature with gentle agitation in 30 ml of a 1:10000
dilution of mouse anti-rabbit IgG conjugated with alkaline phosphatase in TBST. It was then
washed thrice with TBST and once with TBS followed by incubation in the Alkaline
Phosphatase Color Development Reagents, BCIP/NBT in AP color development Buffer
(Bio-Rad, Inc.) for an hour. Immunoblot analysis shows the presence of an 11.5 kDa
polypeptide for purified bacterial CTB and transformed 24h/48h cultures (Fig. 3A, lanes 2,
3 and 5). The 48h culture appears to express more CTB than the 24h culture,
indicating the accumulation of the CTB protein over time. The purified bacterial CTB (45
kDa) dissociated into monomers (11.5 kDa each) due to boiling prior to SDS-PAGE. These
results indicate that the pLD-LH-CTB vector is expressed in E. coli. Because of the
similarity of the E. coli protein synthetic machinery to that of chloroplasts, chloroplast
expression of the above vector should be possible.

CTB expression in chloroplasts: As described below, pLD-LH-CTB was integrated into
the tobacco chloroplast genome via particle bombardment (Daniell, 1997). PCR analysis
was performed to confirm chloroplast integration. Fig. 3B shows primer landing sites and
size of expected products. PCR analysis of clones obtained after the first round of selection
was carried out as described below. PCR products were examined on 0.8% agarose gels.
The PCR results (Fig. 3C) show that clones 1 and 5, which do not show any product, are
mutants while clones 2, 3, 4, 6, 7, 8, 9, 10 and 11, which gave a 1.65 kbp product, are
transgenic. As expected, lanes 13-15 did not give any PCR product, confirming that the
PCR reaction was not contaminated. Because primers 3P & 3M land on the aadA gene and
on the chloroplast genome, all clones that show PCR products have integrated the CTB gene
and the selectable marker into the chloroplast genome. Clones that showed chloroplast
integration of the CTB gene were moved to the second round of selection to increase copy
number. PCR analysis of clones obtained after the second round of selection was also
carried out. PCR results shown in Fig. 3D indicate that clone 5 does not give a 3 kbp
product, indicating that it is a mutant as observed earlier. Other clones give a strong 3 kbp
product and a faint 1.3 kbp product (similar to the 1.3 kbp untransformed plant product),
indicating that they are transgenic but not yet homoplasmic. Complete homoplasmy can be
accomplished by several more rounds of selection or by germinating seeds from transgenic
plants on 500 µg/ml of spectinomycin.

CTB-Proinsulin Vector Construction: The chloroplast expression vector pLD-CTB-Proins
was constructed as follows. First, both proinsulin and cholera toxin B-subunit genes were
amplified from suitable DNA using primer sequences. Primer 1 contains the GGAGG
chloroplast preferred ribosome binding site five nucleotides upstream of the start codon
(ATG) for the CTB gene and a suitable restriction enzyme site (SpeI) for insertion into the
chloroplast vector. Primer 2 eliminates the stop codon and adds the first two amino acids
of a flexible hinge tetrapeptide GPGP as reported by Bergerot et al. (1997), in order to
facilitate folding of the CTB-proinsulin fusion protein. Primer 3 adds the remaining two
amino acids for the hinge tetrapeptide and eliminates the pre-sequence of the pre-proinsulin.
Primer 4 adds a suitable restriction site (SpeI) for subcloning into the chloroplast vector.
Amplified PCR products were inserted into the TA cloning vector. Both the CTB and
proinsulin PCR fragments were excised at the SmaI and XbaI restriction sites. Eluted
fragments were ligated into the TA cloning vector. Interestingly, all white colonies showed
the wrong orientation for the CTB insert while three of the five blue colonies examined showed
the right orientation of the CTB insert. The CTB-proinsulin fragment was excised at the
EcoRI sites and inserted into EcoRI digested dephosphorylated pLD vector. The resultant
chloroplast integration expression vector, pLD-CTB-Proins, will be tested for expression in
E. coli by western blots. After confirmation of expression of the CTB-proinsulin fusion in E.
coli, pLD-CTB-Proins will be bombarded into tobacco cells as described below.

Optimization of fusion gene expression: It has been reported that foreign genes are
expressed between 5% (cryIAc, cryIIA) and 30% (uidA) in transgenic chloroplasts (Daniell,
1999). If the expression levels of the CTB-Proinsulin or polymer-proinsulin fusion proteins
are low, several approaches will be used to enhance translation of these proteins. In
chloroplasts, transcriptional regulation of gene expression is less important, although some
modulations by light and developmental conditions are observed (Cohen and Mayfield,
1997). RNA and protein stability appear to be less important because of the observation of large
accumulation of foreign proteins (e.g. GUS up to 30% of total protein) and tps1 transcripts
16,966-fold higher than in the highly expressing nuclear transgenic plants. Chloroplast gene
expression is regulated to a large extent at the post-transcriptional level. For example, 5'
UTRs are used for optimal translation of chloroplast mRNAs. Shine-Dalgarno (GGAGG)
sequences as well as a stem-loop structure located 5' adjacent to the SD sequence are used
for efficient translation. A recent study has shown that insertion of the psbA 5' UTR
downstream of the 16S rRNA promoter enhanced translation of a foreign gene (GUS)
hundred-fold (Eibl et al. 1999). Therefore, the 85-bp tobacco chloroplast DNA fragment
(1595 - 1680) containing the 5' psbA UTR will be amplified using the following primers:
cctttaaaaagccttccattttctattt and gccatggtaaaatcttggtttatta. This PCR product will be inserted
downstream of the 16S rRNA promoter to enhance translation of the proinsulin fusion
proteins.

Yet another approach for enhancement of translation is to optimize codon
compositions of these fusion proteins. Since both fusion proteins are expressed well in E.
coli, we expected efficient expression in chloroplasts. However, optimizing codon
compositions of proinsulin and CTB genes to match the psbA gene could further enhance
the level of translation. Although rbcL (RuBisCO) is the most abundant protein on earth,
it is not translated as frequently as the psbA gene due to the extremely high turnover of the
psbA gene product. The psbA gene is under stronger selection for increased translation
efficiency and its product is the most abundant thylakoid protein. In addition, codon usage
in higher plant chloroplasts is biased towards the NNC codon of 2-fold degenerate groups
(i.e. TTC over TTT, GAC over GAT, CAC over CAT, AAC over AAT, ATC over ATT, ATA etc.).
This is in addition to a strong bias towards T at the third position of 4-fold degenerate groups.
There is also a context effect that should be taken into consideration while modifying
specific codons. The 2-fold degenerate sites immediately upstream from a GNN codon do
not show this bias towards NNC (TTT GGA is preferred to TTC GGA, while TTC CGT is
preferred to TTT CGT, TTC AGT to TTT AGT and TTC TCT to TTT TCT). In addition,
highly expressed chloroplast genes use GNN more frequently than other genes. The web site
may be used to optimize codon composition by comparing different
species. Abundance of amino acids in chloroplasts can be taken into consideration
(pathways compartmentalized in plastids as opposed to those that are imported into plastids).
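
The NNC preference and its context exception can be expressed as a small rule. The Python sketch below is a simplified illustration, not the procedure used for the constructs: the function name `prefer_nnc` is hypothetical, only the codon pairs listed above are handled, and codons immediately upstream of a GNN codon are simply left unchanged.

```python
# NNT -> NNC substitutions taken from the pairs listed in the text
NNT_TO_NNC = {"TTT": "TTC", "GAT": "GAC", "CAT": "CAC", "AAT": "AAC", "ATT": "ATC"}


def prefer_nnc(codons):
    """Swap listed NNT codons for NNC unless the next codon starts with G."""
    out = []
    for i, codon in enumerate(codons):
        nxt = codons[i + 1] if i + 1 < len(codons) else ""
        if codon in NNT_TO_NNC and not nxt.startswith("G"):
            out.append(NNT_TO_NNC[codon])   # apply the NNC bias
        else:
            out.append(codon)               # context exception or codon not listed
    return out


print(prefer_nnc(["TTT", "CGT"]))
print(prefer_nnc(["TTT", "GGA"]))
```

The two calls mirror the examples in the paragraph: TTT followed by CGT is changed to TTC, whereas TTT followed by GGA is kept. A fuller treatment would also cover the T bias at 4-fold degenerate sites, which this sketch omits.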

As far as the biopolymer gene is concerned, we observed incomplete translation
products in plastids when we expressed the 120mer gene (Guda et al. 2000). Therefore,
while expressing the polymer-proinsulin fusion protein, we decreased the length of the
polymer protein to 40mer, without losing the thermal responsive property. In addition,
optimal codons for glycine (GGT) and valine (GTA), which constitute 80% of the total
amino acids of the polymer, have been used. In all nuclear encoded genes glycine makes up
147/1000 amino acids while in tobacco chloroplasts it is 129/1000. Highly expressed genes
like psbA and rbcL of tobacco contain 192 and 190 gly/1000, respectively. Therefore, glycine
may not be a limiting factor. Nuclear genes use 52/1000 proline as opposed to 42/1000 in
chloroplasts. However, the currently used codon for proline (CCG) can be modified to CCA or
CCT to further enhance translation. It is known that pathways for proline and valine are
compartmentalized in chloroplasts (Guda et al. 2000). Also, proline is known to accumulate
in chloroplasts as an osmoprotectant (Daniell et al. 1994).
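
The "per 1000 amino acids" figures used above can be reproduced for any protein sequence with a short calculation. The sketch below is illustrative; the function name is hypothetical, and the GVGVP repeat used in the example is an assumption about the polymer unit (it is consistent with glycine and valine making up 80% of the residues, but the repeat is not spelled out in this section).

```python
# Illustrative sketch: frequency of one residue per 1000 amino acids.
# The GVGVP repeat below is an assumed polymer unit, used only as an example.

def per_thousand(protein: str, residue: str) -> float:
    """Return occurrences of `residue` per 1000 residues of `protein`."""
    protein = protein.upper()
    if not protein:
        raise ValueError("empty protein sequence")
    return 1000.0 * protein.count(residue.upper()) / len(protein)


polymer_40mer = "GVGVP" * 40  # hypothetical 40mer of the assumed repeat
for aa in "GVP":
    print(aa, per_thousand(polymer_40mer, aa))
```

Comparing such values with the chloroplast averages quoted in the text (129 gly/1000, 42 pro/1000) gives a rough indication of which residues are in unusually high demand.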

Bombardment and Regeneration of Chloroplast Transgenic Plants: Tobacco (Nicotiana
tabacum var. Petit Havana) and nicotine free edible tobacco (LAMD 665, gift from Dr.
Keith Wycoff, Planet Biotechnology) plants are grown aseptically by germination of seeds
on MSO medium. This medium contains MS salts (4.3 g/liter), B5 vitamin mixture
(myo-inositol, 100 mg/liter; thiamine-HCl, 10 mg/liter; nicotinic acid, 1 mg/liter;
pyridoxine-HCl, 1 mg/liter), sucrose (30 g/liter) and phytagar (6 g/liter) at pH 5.8. Fully
expanded, dark green leaves of about two month old plants are used for bombardment.
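
As a convenience, the MSO recipe above can be scaled to any batch volume. The sketch below simply multiplies the per-liter amounts given in the text by the desired volume; the dictionary and function names are hypothetical, and pH adjustment to 5.8 is not modeled.

```python
# Illustrative sketch: scale the MSO medium recipe stated in the text.
# Amounts are grams per liter; names are hypothetical.

MSO_G_PER_LITER = {
    "MS salts": 4.3,
    "myo-inositol": 0.100,
    "thiamine-HCl": 0.010,
    "nicotinic acid": 0.001,
    "pyridoxine-HCl": 0.001,
    "sucrose": 30.0,
    "phytagar": 6.0,
}


def amounts_for_volume(liters: float) -> dict:
    """Return grams of each component needed for `liters` of medium."""
    if liters <= 0:
        raise ValueError("volume must be positive")
    return {name: grams * liters for name, grams in MSO_G_PER_LITER.items()}


for name, grams in amounts_for_volume(0.5).items():
    print(f"{name}: {grams:.4f} g")
```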

Leaves are placed abaxial side up on a Whatman No. 1 filter paper lying on the
RMOP medium (Daniell, 1993) in standard petri plates (100x15 mm) for bombardment.
Tungsten (1 µm) or gold (0.6 µm) microprojectiles are coated with plasmid DNA
(chloroplast vectors) and bombardments carried out with the biolistic device PDS-1000/He
(Bio-Rad) as described by Daniell (1997). Following bombardment, petri plates are sealed
with parafilm and incubated at 24°C under 12 h photoperiod. Two days after bombardment,
leaves are chopped into small pieces of ~5 mm² in size and placed on the selection medium
(RMOP containing 500 µg/ml of spectinomycin dihydrochloride) with abaxial side touching
the medium in deep (100x25 mm) petri plates (~10 pieces per plate). The regenerated
spectinomycin resistant shoots are chopped into small pieces (~2 mm²) and subcloned into
fresh deep petri plates (~5 pieces per plate) containing the same selection medium. Resistant
shoots from the second culture cycle are transferred to the rooting medium (MSO medium
supplemented with IBA, 1 mg/liter and spectinomycin dihydrochloride, 500 mg/liter).
Rooted plants are transferred to soil and grown at 26°C under continuous lighting conditions
for further analysis.

Polymerase Chain Reaction: PCR is performed using DNA isolated from control and
transgenic plants to distinguish a) true chloroplast transformants from mutants and b)
chloroplast transformants from nuclear transformants. Primers for testing the presence of
the aadA gene (that confers spectinomycin resistance) in transgenic plants land on the
aadA coding sequence and 16S rRNA gene (primers 1P & 1M). To test chloroplast
integration of the insulin gene, one primer lands on the aadA gene, while another lands on
the native chloroplast genome (primers 3P & 3M) as shown in Figs. 2A and 3B. No PCR
product is obtained with nuclear transgenic plants using this set of primers. The primer set
(2P & 2M, in Figs. 2A and 3B) is used to test integration of the entire gene cassette without
internal deletion or looping out during homologous recombination. A similar strategy has
been used successfully to confirm chloroplast integration of foreign genes (Daniell et al.,
1998; Kota et al., 1999; Guda et al., 1999). This screening is essential to eliminate mutants
and nuclear transformants.

Total DNA from unbombarded and transgenic plants is isolated as described by
Edwards et al. (1991) to conduct PCR analyses in transgenic plants. PCR reactions are
performed in a total volume of 50 µl containing approximately 10 ng of template DNA and
1 µM of each primer in a mixture of 300 µM of each deoxynucleotide (dNTPs), 200 mM
Tris (pH 8.8), 100 mM KCl, 100 mM (NH4)2SO4, 20 mM MgSO4, 1% Triton X-100, 1
mg/ml nuclease-free BSA and 1 or 2 units of Taq Plus polymerase (Stratagene, La Jolla,
CA). PCR is carried out in the Perkin Elmer GeneAmp PCR system 2400, by subjecting
the samples to 94°C for 5 min and 30 cycles of 94°C for 1 min, 55°C for 1.5 min, 72°C for
1.5 or 2 min followed by a 72°C step for 7 min. PCR products are analyzed by
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electrophoresis on 0.8% agarose gels. Chloroplast transgenic plants containing the 
proinsulin gene are then moved to second round of selection to achieve homoplasmy. 
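
For planning purposes, the hold time of the cycling program described above can be added up directly. The sketch below uses only the step durations stated in the text; the function name is hypothetical, and instrument ramp times between temperatures are ignored, so actual run times will be longer.

```python
# Illustrative sketch: total programmed hold time of the PCR profile in the
# text (ramp times between steps are not included).

def pcr_program_minutes(extension_min: float,
                        cycles: int = 30,
                        initial_denature_min: float = 5.0,
                        denature_min: float = 1.0,
                        anneal_min: float = 1.5,
                        final_extension_min: float = 7.0) -> float:
    """Return the summed hold time, in minutes, of the whole program."""
    per_cycle = denature_min + anneal_min + extension_min
    return initial_denature_min + cycles * per_cycle + final_extension_min


print(pcr_program_minutes(1.5))  # 72°C extension of 1.5 min per cycle
print(pcr_program_minutes(2.0))  # 72°C extension of 2 min per cycle
```
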

Southern Blot Analysis: Southern blots are performed to determine the copy number of the
introduced foreign gene per cell as well as to test homoplasmy. There are several thousand
copies of the chloroplast genome present in each plant cell. Therefore, when foreign genes
are inserted into the chloroplast genome, it is possible that some of the chloroplast genomes
have foreign genes integrated while others remain as the wild type (heteroplasmy).
To ensure that only the transformed genome exists in cells of transgenic plants
(homoplasmy), the selection process is continued. To confirm that the wild type genome
does not exist at the end of the selection cycle, total DNA from transgenic plants should be
probed with the chloroplast border (flanking) sequences (the trnI-trnA fragment, Figs. 2A
and 3B). If wild type genomes are present (heteroplasmy), the native fragment size is
observed along with transformed genomes. Presence of a large fragment (due to insertion
of foreign genes within the flanking sequences) and absence of the native small fragment
confirms homoplasmy (Daniell et al., 1998; Kota et al., 1999; Guda et al., 1999).

The copy number of the integrated gene is determined by establishing homoplasmy
for the transgenic chloroplast genome. Tobacco chloroplasts contain 5000-10,000 copies
of their genome per cell (Daniell et al., 1998). If only a fraction of the genomes are actually
transformed, the copy number, by default, must be less than 10,000. By establishing that in
the transgenics the transformed genome carrying the insulin gene is the only one present, one
can establish that the copy number is 5000-10,000 per cell. This is usually achieved by
digesting the total DNA with a suitable restriction enzyme and probing with the flanking
sequences that enable homologous recombination into the chloroplast genome. The native
fragment present in the control should be absent in the transgenics. The absence of the native
fragment proves that only the transgenic chloroplast genome is present in the cell and that
there is no native, untransformed chloroplast genome lacking the insulin gene. This
establishes the homoplasmic nature of the transformants and simultaneously provides
an estimate of 5000-10,000 copies of the foreign genes per cell.
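
The reasoning above reduces to a simple product, sketched below. The function name and the `copies_per_genome` parameter are assumptions introduced for illustration (a value of 1 is assumed by default); the genome counts are the 5000-10,000 range quoted in the text.

```python
# Illustrative sketch: foreign gene copies per cell as a function of the
# fraction of chloroplast genomes that are transformed. Names are hypothetical.

def foreign_gene_copies(genomes_per_cell: float,
                        transformed_fraction: float,
                        copies_per_genome: int = 1) -> float:
    """Return the estimated number of foreign gene copies per cell."""
    if not 0.0 <= transformed_fraction <= 1.0:
        raise ValueError("transformed_fraction must lie between 0 and 1")
    return genomes_per_cell * transformed_fraction * copies_per_genome


# Homoplasmy (fraction = 1.0) gives the full 5000-10,000 range.
print(foreign_gene_copies(5000, 1.0), foreign_gene_copies(10000, 1.0))
# Heteroplasmy (for example, half the genomes transformed) gives fewer copies.
print(foreign_gene_copies(10000, 0.5))
```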

Total DNA is extracted from leaves of transformed and wild type plants using the
CTAB procedure outlined by Rogers and Bendich (1988). Total DNA is digested with
suitable restriction enzymes, electrophoresed on 0.7% agarose gels and transferred to nylon
membranes (Micron Separation Inc., Westboro, MA). Probes are labeled with 32P-dCTP
using the random-primed procedure (Promega). Pre-hybridization and hybridization steps
are carried out at 42°C for 2 h and 16 h, respectively. Blots are soaked in a solution
containing 2X SSC and 0.5% SDS for 5 min followed by transfer to 2X SSC and 0.1% SDS
solution for 15 min at room temperature. Then, blots are incubated in hybridization bottles
containing 0.1X SSC and 0.5% SDS solution for 30 min at 37°C followed by another step
at 68°C for 30 min, with gentle agitation. Finally, blots are briefly rinsed in 0.1X SSC
solution, dried and exposed to X-ray film in the dark.

Northern Blot Analysis: Northern blots are performed to test the efficiency of transcription
of the proinsulin gene fused with CTB or polymer genes. Total RNA is isolated from 150
mg of frozen leaves by using the "RNeasy Plant Total RNA Isolation Kit" (Qiagen Inc.,
Chatsworth, CA). RNA (10-40 µg) is denatured by formaldehyde treatment, separated on
a 1.2% agarose gel in the presence of formaldehyde and transferred to a nitrocellulose
membrane (MSI) as described in Sambrook et al. (1989). Probe DNA (proinsulin gene
coding region) is labeled by the random-primed method (Promega) with 32P-dCTP.
The blot is pre-hybridized, hybridized and washed as described above for Southern blot
analysis. Transcript levels are quantified by the Molecular Analyst Program using the GS-
700 Imaging Densitometer (Bio-Rad, Hercules, CA).

Polymer-insulin fusion protein purification, quantitation and characterization: Because
polymer-insulin fusion proteins exhibit inverse temperature transition properties as shown
in Figs. 1A and B, they are purified from transgenic plants essentially following the same
method used for polymer purification from transgenic tobacco plants (Zhang et al., 1996).
However, an additional step is introduced to take advantage of the compartmentalization of
the insulin-polymer fusion protein within chloroplasts. Chloroplasts are first isolated from
crude homogenate of leaves by a simple centrifugation step at 1500 X g. This eliminates most
of the cellular organelles and proteins (Daniell et al., 1983, 1986). Then, chloroplasts are
burst open by resuspending them in a hypotonic buffer (osmotic shock). This is a significant
advantage because there are fewer soluble proteins inside chloroplasts when compared to
hundreds of soluble proteins in the cytosol. Polymer extraction buffer contains 50 mM
Tris-HCl, pH 7.5, 1% 2-mercaptoethanol, 5 mM EDTA, 2 mM PMSF and 0.8 M NaCl. The
homogenate is then centrifuged at 10,000 g for 10 min (4°C), and the pellet discarded. The
supernatant is incubated at 42°C for 30 minutes and then centrifuged immediately for 3
minutes at 5,000 g (room temperature). If insulin is found to be sensitive to this temperature,
the transition temperature (Tt) is lowered by increasing salt concentration (McPherson et al.,
1996). The pellet containing the insulin-polymer fusion protein is resuspended in the
extraction buffer and incubated on ice for 10 minutes. The mixture is centrifuged at 12,000 g
for 10 minutes (4°C). The supernatant is then collected and stored at -20°C. The purified
polymer-insulin fusion protein is electrophoresed in an SDS-PAGE gel according to Laemmli
(1970) and visualized by either staining with 0.3 M CuCl2 (Lee et al., 1987) or transferred to
nitrocellulose membrane and probed with antiserum raised against the polymer or insulin
protein as described below. Quantification of purified polymer proteins may then be carried
out by densitometry.

After electrophoresis, proteins are transferred to a nitrocellulose membrane
electrophoretically in 25 mM Tris, 192 mM glycine, 5% methanol (pH 8.3). The filter is
blocked with 2% dry milk in Tris-buffered saline for two hours at room temperature and
stained with antiserum raised against the polymer AVGVP (kindly provided by the
University of Alabama at Birmingham, monoclonal facility) overnight in 2% dry milk/Tris
buffered saline. The protein bands reacting to the antibodies are visualized using alkaline
phosphatase-linked secondary antibody and the substrates nitroblue tetrazolium and
5-bromo-4-chloro-3-indolyl-phosphate (Bio-Rad). Alternatively, for insulin-polymer fusion
proteins, a mouse anti-human proinsulin (IgG1) monoclonal antibody is used as a primary
antibody. To detect the binding of the primary antibody to the recombinant proinsulin, a
goat anti-mouse IgG horseradish peroxidase (HRP) labeled monoclonal antibody is used.
The substrate used for reaction with HRP is 3,3',5,5'-tetramethylbenzidine. All products
are available from American Qualex Antibodies, San Clemente, CA. As a positive control,
human recombinant proinsulin from Sigma may be used. This human recombinant
proinsulin was expressed in E. coli from a synthetic proinsulin gene. Quantification of purified
polymer fusion proteins is carried out by densitometry using Scanning Analysis software
(BioSoft, Ferguson, MO) installed on a Macintosh LC III computer (Apple Computer,
Cupertino, USA) with a 160-Mb hard disk operating on System 7.1, connected by SCSI
interface to a Relisys RELI 2412 Scanner (Relisys, Milpitas, CA). Total protein content is
then determined by the dye-binding assay using reagents supplied in a kit from Bio-Rad, with
bovine serum albumin as a standard.

Characterization of CTB expression: CTB protein levels in transgenic plants are
determined using quantitative ELISA assays. A standard curve is generated using known
concentrations of bacterial CTB. A 96-well microtiter plate coated with 100 µl/well of
bacterial CTB (concentrations in the range of 10 - 1000 ng) is incubated overnight at 4°C.
The plate is washed thrice with PBST (phosphate buffered saline containing 0.05% Tween-
20). The background is blocked by incubation in 1% bovine serum albumin (BSA) in PBS
(300 µl/well) at 37°C for 2 h followed by washing 3 times with PBST. The plate is incubated
in a 1:8,000 dilution of rabbit anti-cholera toxin antibody (Sigma C-3062) (100 µl/well) for
2 h at 37°C, followed by washing the wells three times with PBST. The plate is incubated
with a 1:80,000 dilution of anti-rabbit IgG conjugated with alkaline phosphatase (100 µl/well)
for 2 h at 37°C and washed thrice with PBST. Then, 100 µl alkaline phosphatase substrate
(Sigma Fast p-nitrophenyl phosphate tablet in 5 ml of water) is added and the reaction
stopped with 1 M NaOH (50 µl/well) when absorbances in the mid-range of the titration
reach about 2.0, or after 1 hour, whichever comes first. The plate is then read at 405 nm.
These results are used to generate a standard curve from which concentrations of plant
protein can be extrapolated. Thus, total soluble plant protein (concentration previously
determined using the Bradford assay) in bicarbonate buffer, pH 9.6 (15 mM Na2CO3, 35 mM
NaHCO3) is loaded at 100 µl/well and the same procedure as above can be repeated.
The absorbance values are used to determine the ratio of CTB protein to total soluble plant
protein, using the standard curve generated previously and the Bradford assay results.
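
The quantification step above can be sketched as a piecewise-linear read-off from the standard curve followed by a ratio against total soluble protein. This is a simplified illustration, not the curve-fitting method used in the study: the function names are hypothetical, the standard values in the example are invented placeholders, and readings outside the range of the standards are rejected rather than extrapolated.

```python
# Illustrative sketch: read a CTB amount off a standard curve by linear
# interpolation, then express it as a percentage of total soluble protein.
# All names and the example standards are hypothetical.

def interpolate_amount(absorbance, standards):
    """standards: iterable of (amount_ng, absorbance) pairs from known CTB."""
    points = sorted(standards, key=lambda p: p[1])
    if len(points) < 2:
        raise ValueError("at least two standards are required")
    if not points[0][1] <= absorbance <= points[-1][1]:
        raise ValueError("absorbance lies outside the standard curve")
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if y0 <= absorbance <= y1:
            if y1 == y0:
                return x0
            return x0 + (absorbance - y0) * (x1 - x0) / (y1 - y0)
    raise ValueError("no bracketing standards found")


def ctb_percent_of_tsp(ctb_ng, total_soluble_protein_ng):
    """Return CTB as a percentage of total soluble protein (Bradford value)."""
    return 100.0 * ctb_ng / total_soluble_protein_ng


example_standards = [(10, 0.2), (100, 0.8), (1000, 2.0)]  # placeholder values
ctb = interpolate_amount(0.5, example_standards)
print(ctb, ctb_percent_of_tsp(ctb, 1000.0))
```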

Inheritance of Introduced Foreign Genes: Of the initial tobacco transformants, some are
allowed to self-pollinate, whereas others are used in reciprocal crosses with control tobacco
(transgenics as female acceptors and pollen donors: testing for maternal inheritance).
Harvested seeds (T1) are germinated on media containing spectinomycin. Achievement of
homoplasmy and mode of inheritance can be classified by looking at germination results.
Homoplasmy is indicated by totally green seedlings (Daniell et al., 1998) while heteroplasmy
is displayed by variegated leaves (lack of pigmentation, Svab & Maliga, 1993). Lack of
variation in chlorophyll pigmentation among progeny also underscores the absence of
position effect, an artifact of nuclear transformation. Maternal inheritance may be
demonstrated by sole transmission of introduced genes via seed generated on transgenic
plants, regardless of pollen source (green seedlings on selective media). When transgenic
pollen is used for pollination of control plants, the resultant progeny do not carry resistance
to the chemical in selective media (they will appear bleached; Svab and Maliga, 1993).
Molecular analyses confirm transmission and expression of introduced genes, and T2 seed is
generated from those plants confirmed by the analyses described above.

Comparison of Current Purification with Polymer-based Purification Methods: It is
important to compare purification methods to test yield and purity of insulin produced in E.
coli and tobacco. One liter of pSBL containing bacteria is grown in LB/ampicillin (100
µg/ml) overnight and the fusion protein expressed. Cells are harvested by centrifugation at
5000 X g for 10 min at 4°C, and the bacterial pellets resuspended in 5 ml/g (wet wt.
bacteria) of 100 mM Tris-HCl, pH 7.3. Lysozyme is added at a concentration of 1 mg/ml
and the suspension placed on a rotating shaker at room temperature for 15 min. The lysate is
subjected to probe sonication for two cycles of 30 s on/30 s off at 4°C. Cellular debris is
removed by centrifugation at 1000 X g for 5 min at 4°C. Insulin polymer fusion protein is
purified by inverse temperature transition properties (Daniell et al., 1997). Alternatively, the
fusion protein is purified according to Cowley and Mackin (1997). The supernatant is retained
and centrifuged again at 27000 X g for 15 min at 4°C to pellet the inclusion bodies. The
supernatant is discarded and the pellet resuspended in 1 ml/g (original wt. bacteria) of dH2O,
aliquoted into microcentrifuge tubes as 1 ml fractions, and then centrifuged at 16000 X g for
5 min at 4°C. The pellets are individually washed with 1 ml of 100 mM Tris-HCl, pH 8.5,
1 M urea, 1% Triton X-100 and again washed with 100 mM Tris-HCl, pH 8.5, 2 M urea, 2%
Triton X-100. The pellets are resuspended in 1 ml of dH2O and transferred to a pre-
weighed 30 ml Corex centrifuge tube. The sample is centrifuged at 15000 X g for 5 min at
4°C, and the pellet resuspended in 10 ml/g (wet wt. pellet) of 70% formic acid. Cyanogen
bromide is added to a final concentration of 400 mM and the sample incubated at room
temperature in the dark for 16 h. The reaction is stopped by transferring the sample to a
round bottom flask and removing the solvent by rotary evaporation at 50°C. The residue is
resuspended in 20 ml/g (wet wt. pellet) of dH2O, shell frozen in a dry ice ethanol bath, and
then lyophilized. The lyophilized protein is dissolved in 20 ml/g (wet wt. pellet) of 500 mM
Tris-HCl, pH 8.2, 7 M urea. Oxidative sulfitolysis is performed by adding sodium sulfite
and sodium tetrathionate to final concentrations of 100 and 10 mM, respectively, and
incubating at room temperature for 3 h. This reaction is then stopped by freezing on dry ice.

Purification and folding of Human Proinsulin: The S-sulfonated material is applied to a
2 ml bed of Sephadex G-25 equilibrated in 20 mM Tris-HCl, pH 8.2, 7 M urea, and then
washed with 9 vols of 7 M urea. The collected fraction is then applied to a Pharmacia Mono
Q HR 5/5 column equilibrated in 20 mM Tris-HCl, pH 8.2, 7 M urea at a flow rate of 1
ml/min. A linear gradient leading to a final concentration of 0.5 M NaCl is used to elute the
bound material. Fractions of 2 min (2 ml) are collected during the gradient, and the protein
concentration in each fraction determined. Purity and molecular mass of fractions are
estimated by Tricine SDS-PAGE (as shown in Fig. 2), where Tricine is used as the trailing
ion to allow better resolution of peptides in the range of 1-1000 kDa. Appropriate fractions
are pooled and applied to a 1.6 X 20 cm column of Sephadex G-25 (superfine) equilibrated
in 5 mM ammonium acetate pH 6.8. The sample is collected based on UV absorbance and
freeze-dried. The partially purified S-sulfonated material is resuspended in 50 mM
glycine/NaOH, pH 10.5 at a final concentration of 2 mg/ml. β-mercaptoethanol is added
at a ratio of 1.5 mol per mol of cysteine S-sulfonate and the sample stirred at 4°C in an open
container for 16 h. The sample is then analyzed by reversed-phase high-performance liquid
chromatography (RP-HPLC) using a Vydac C4 column (2.2 X 150 mm) equilibrated in 4%
acetonitrile and 0.1% TFA. Adsorbed peptides are eluted with a linear gradient of increasing
acetonitrile concentration (0.88% per min up to a maximum of 48%). The remaining
refolded proinsulin is centrifuged at 16000 X g to remove insoluble material, and loaded
onto a semi-preparative Vydac C4 column (10 X 250 mm). The bound material is then
eluted as described above, and the proinsulin collected and lyophilized.
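
The RP-HPLC gradient stated above (starting at 4% acetonitrile and rising 0.88% per min to a maximum of 48%) implies a gradient length of (48 - 4) / 0.88 = 50 min. The sketch below computes the programmed acetonitrile percentage at any time; the function names are hypothetical and instrument dwell volume is ignored.

```python
# Illustrative sketch of the linear acetonitrile gradient described in the
# text. Names are hypothetical; dwell volume and re-equilibration are ignored.

def acetonitrile_percent(t_min: float, start: float = 4.0,
                         slope: float = 0.88, maximum: float = 48.0) -> float:
    """Return the programmed % acetonitrile t_min minutes into the gradient."""
    if t_min < 0:
        raise ValueError("time must be non-negative")
    return min(start + slope * t_min, maximum)


def gradient_duration_min(start: float = 4.0, slope: float = 0.88,
                          maximum: float = 48.0) -> float:
    """Return the time needed to go from `start` to `maximum`."""
    return (maximum - start) / slope


print(gradient_duration_min())
print(acetonitrile_percent(25))
```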

Analysis and characterization of insulin expressed in E. coli and Tobacco: The purified
expressed proinsulin is subjected to matrix-assisted laser desorption/ionization-time of flight
(MALDI-TOF) analysis (as described by Cowley and Mackin, 1997), using proinsulin from
Eli Lilly as both an internal and external standard. A proteolytic digestion is performed
using Staphylococcus aureus protease V8 to determine if the disulfide bridges have formed
correctly naturally inside chloroplasts or by in vitro processing. Five µg of both the
expressed proinsulin and Eli Lilly's proinsulin are lyophilized and resuspended in 50 µl of
250 mM NaPO4 pH 7.8. Protease V8 is added at a ratio of 1:50 (w/w) in experimental
samples and no enzyme added to the controls. All samples are then incubated overnight at
37°C, the reactions stopped by freezing on dry ice, and samples stored at -20°C until
analyzed. The samples are analyzed by RP-HPLC using a Vydac C4 column (2.2 X 150 mm)
equilibrated in 4% acetonitrile and 0.1% TFA. Bound material is then eluted using a linear
gradient of increasing acetonitrile concentration (0.88% per min up to a maximum of 48%).

CTB-GM1 ganglioside binding assay: A GM1-ELISA assay is performed as described by
Arakawa et al. (1997) to determine the affinity of plant-derived CTB for GM1-ganglioside.
The microtiter plate is coated with monosialoganglioside-GM1 (Sigma G-7641) by
incubating the plate with 100 µl/well of GM1 (3.0 µg/ml) in bicarbonate buffer, pH 9.6 at
4°C overnight. Alternatively, the wells are coated with 100 µl/well of BSA (3.0 µg/ml) as
control. The plates are incubated with transformed plant total soluble protein and bacterial
CTB (Sigma C-9903) in PBS (100 µl/well) overnight at 4°C. The remainder of the
procedure is then identical to the ELISA described above.

Mouse feeding assays for CTB: This is performed as described by Haq et al. (1995).
BALB/c mice, divided into groups of five animals each, are fasted overnight before feeding
them transformed edible tobacco (that tastes like spinach) expressing CTB, untransformed
edible tobacco and purified bacterial CTB. Feedings are performed at weekly intervals (0,
7, 14 days) for three weeks. Animals are observed to confirm complete consumption of
material. On day 20, fecal and serum samples are collected from each animal for analysis
of anti-CTB antibodies. Mice are bled retro-orbitally and the samples stored at -20°C until
assayed. Fecal samples are collected and frozen overnight at -70°C, lyophilized,
resuspended in 0.8 ml PBS (pH 7.2) containing 0.05% sodium azide per 15 fecal pellets,
centrifuged at 1400 X g for 5 min and the supernatant stored at -20°C until assayed. Samples
are then serially diluted in PBS containing 0.05% Tween-20 (PBST) and assayed for anti-
CTB IgG in serum and anti-CTB IgA in fecal pellets by the ELISA method, as described
earlier.






Assessment of diabetic symptoms in NOD mice: The incidence of diabetic symptoms is
compared among mice fed with control nicotine free edible tobacco and those fed with tobacco
that expresses the CTB-proinsulin fusion protein. Four week old female NOD mice are divided
into two groups, each group consisting of ten mice. Each group is fed with control or transgenic
edible tobacco (nicotine free) expressing the CTB-proinsulin fusion gene. The feeding
dosage is determined based on the level of expression. Starting at 10 weeks of age, the mice
are monitored on a biweekly basis with urinary glucose test strips (Clinistix and Diastix,
Bayer) for development of diabetes. Glycosuric mice are bled from the tail vein to check for
glycemia using a glucose analyzer (Accu-Chek, Boehringer Mannheim). Diabetes is
confirmed by hyperglycemia (>250 mg/dl) for two consecutive weeks (Ma et al., 1997).

EXPRESSION OF THE NATIVE CHOLERA TOXIN B SUBUNIT GENE
AS OLIGOMERS IN TRANSGENIC TOBACCO CHLOROPLASTS

FIELD OF THE INVENTION

This invention relates to expression of native cholera toxin B subunit gene as 
oligomers in transgenic plant chloroplasts, particularly, in transgenic tobacco chloroplasts. 

BACKGROUND 

Pharmacologically important proteins are increasingly being expressed in plants as
an economical alternative to conventional protein production methods. Transgenic plants
expressing recombinant proteins and biologically active peptides such as vaccines, growth
factors, hormones, monoclonal antibodies and enzymes have been reported (1).

Proteins from a wide range of sources have been expressed in nuclear
transgenic plants. Protein accumulation levels of recombinant enzymes, like phytase and
xylanase, were high in nuclear transgenic plants (14% and 4.1% of total soluble tobacco leaf
protein respectively). This may be because their enzymatic nature made them more resistant
to proteolytic degradation. Most nuclear transgenic plants express low levels of recombinant
protein of human, viral or bacterial origin. Human proteins are expressed at levels ranging
from as low as 0.000017% of fresh weight (human β interferon expressed in tobacco) to a
high of 0.1% of soluble seed protein (human enkephalin expressed in Arabidopsis seeds, 2).
The Norwalk virus capsid protein expressed in potatoes caused oral immunization when
consumed, although expression levels in the food were low, reaching a maximum of 0.3% of
total soluble protein (3).


SUMMARY OF THE INVENTION 

This invention includes expression of native cholera toxin B subunit gene as 
oligomers in transgenic tobacco chloroplasts which may be utilized in connection with large- 
scale production of purified CTB, as well as an edible vaccine if expressed in an edible plant 
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or as a transmucosal carrier of peptides to which it is fused to either enhance mucosal 
immunity or to induce oral tolerance of the products of these peptides. 

BRIEF DESCRIPTION OF THE DRAWINGS 

Fig. 1. pLD-LH-CTB vector and PCR analysis of control and chloroplast
transformants. A. The perpendicular dotted line shows the vector sequences that are
homologous to native chloroplast DNA, resulting in homologous recombination and site
specific integration of the gene cassette into the chloroplast genome. Primer landing sites
are also shown. B. PCR analysis: 0.8% agarose gel of PCR products using total plant DNA
as template. 1 kb ladder (lane 1); Untransformed plant (lane 2); PCR products with DNA
template from transgenic lines 1-10 (lanes 3-12).

Fig. 2. Western blot analysis of CTB expression in E. coli and chloroplasts.
Blots were detected using rabbit anti-cholera serum as primary antibody and alkaline
phosphatase labeled mouse anti-rabbit IgG as secondary antibody. A. E. coli protein analysis:
Purified bacterial CTB, boiled (lane 1); Unboiled 24 h and 48 h transformed (lanes 2 & 4)
and untransformed (lanes 3 & 5) E. coli cell extracts. Plant protein analysis: B. Color
development detection: Boiled, untransformed protein (lane 1); Boiled, purified CTB
antigen (lane 2); Boiled protein from 4 different transgenic lines (lanes 3-6). C.
Chemiluminescent detection: Plant protein: Untransformed, unboiled (lane 1);
Untransformed, boiled (lane 2); Transgenic lines 3 & 7, boiled (lanes 3 & 5); Transgenic line
3, unboiled (lane 4); Purified CTB antigen boiled (lane 6), unboiled (lane 7); Marker (lane
8).

Fig. 3. Southern blot analysis of T0 and T1 plants. A. Untransformed and
transformed chloroplast genome: Transformed and untransformed plant DNA was digested
with BglII and hybridized with the 0.81 kb probe that contained the chloroplast flanking
sequences used for homologous recombination. Southern blot results of T0 lines (B):
Untransformed plant DNA (lane 1); Transformed lines DNA (lanes 2-4); and T1 lines (C):
Transformed plant DNA (lanes 1-4) and Untransformed plant DNA (lane 5).

Fig. 4. A. Plant phenotypes; 1: Confirmed transgenic line 7; 2: Untransformed
plant. B. 10-day-old seedlings of T1 transformed (1, 2 & 3) and untransformed plant (4)
plated on 500 mg/L spectinomycin selection medium.

Fig. 5. A. CTB ELISA quantification: Absorbance of CTB-antibody complex
in known concentrations of total soluble plant protein was compared to absorbance of known
concentrations of bacterial CTB-antibody complex and the amount of CTB was expressed as
a percentage of the total soluble plant protein. Total soluble plant protein from young,
mature and old leaves of transgenic lines 3 and 7 was quantified. B. CTB-GM1 ganglioside
binding ELISA assays: Plates, coated first with GM1 gangliosides and BSA respectively,
were plated with total soluble plant protein from lines 3 and 7, untransformed plant total
soluble protein and purified bacterial CTB, and the absorbance of the GM1 ganglioside-CTB-
antibody complex in each case was measured.

DETAILED DESCRIPTION 

Bacterial antigens like the B subunit proteins, CTB and LTB, which are two
chemically, structurally and immunologically similar candidate vaccine antigens of
prokaryotic enterotoxins, have been expressed in plants. CTB is a candidate oral subunit
vaccine for cholera, whose causative agent induces acute watery diarrhoea by colonizing the
small intestine and producing the enterotoxin, cholera toxin (CT). Cholera toxin is a
hexameric AB5 protein consisting of one toxic 27 kDa A subunit having ADP ribosyl
transferase activity and a nontoxic pentamer of 11.6 kDa B subunits (CTB) that binds to the
A subunit and facilitates its entry into the intestinal epithelial cells. CTB when administered
orally is a potent mucosal immunogen, which can neutralize the toxicity of the CT holotoxin
by preventing it from binding to the intestinal cells (4). This is believed to be a result of it
binding to eukaryotic cell surfaces via GM1 gangliosides, receptors present on the intestinal
epithelial surface, eliciting a mucosal immune response to pathogens and enhancing the
immune response when chemically coupled to other antigens (5,6).

Native CTB and LTB genes have been expressed at low levels via the plant nucleus.
Since both CTB and LTB are AT-rich compared to plant nuclear genes, low expression was
probably due to a number of factors such as aberrant mRNA splicing, mRNA instability or
inefficient codon usage. To avoid these undesirable features, synthetic "plant optimized"
genes encoding LTB were created and expressed in potato, resulting in potato tubers
expressing up to 10-20 µg of LTB per gram fresh weight (7). However, extensive codon
modification of genes is laborious, expensive and often not available due to patent
restrictions. One of the consequences of these constitutively expressed high LTB levels was
the stunted growth of transgenic plants, which was eventually overcome by tissue specific
expression in potato tubers. The maximum amount of CTB protein detected in auxin
induced, nuclear transgenic potato leaf tissues was approximately 0.3% of the total soluble
leaf protein when the native CTB gene was fused to an endoplasmic reticulum retention
signal, thus targeting the protein to the endoplasmic reticulum for accumulation and
assembly (8).

Increased expression levels of several proteins have been attained by expressing
foreign proteins in chloroplasts of higher plants (9 - 11). Human somatotropin has been
expressed in chloroplasts with yields of 7% of the total soluble protein (12). The
accumulation levels of the Bt Cry2Aa2 operon in tobacco chloroplasts are as high as 46.1%
of the total soluble plant protein (13). This high level of expression is attributed to the
putative chaperonin, orf 1 and orf 2, upstream of Cry2Aa2 in the operon that may help to
fold the protein into a crystalline form that is stable and resistant to proteolytic degradation.
Besides the ability to express polycistrons, yet another advantage of chloroplast
transformation is the lack of recombinant protein expression in pollen of chloroplast
transgenic plants. As there is no chloroplast DNA in pollen of most crops, pollen mediated
outcross of recombinant genes into the environment is minimized (10 - 15).

Since the transcriptional and translational machinery of plastids is prokaryotic in
origin and the N. tabacum chloroplast genome has 62.2% AT content, it was likely that
native CTB genes would be efficiently expressed in this organelle without the need for codon
modification. Also, codon comparison of the CTB gene with psbA, which encodes the major
translation product of the chloroplast, showed 47% homology with the most frequent codons
of the psbA gene. Highly expressed plastid genes display a codon adaptation, which is defined as
a bias towards a set of codons that are complementary to abundant tRNAs (16). Codon
analysis showed that 34% of the codons of CTB are complementary to the tRNA population
in the chloroplasts, in comparison with 51% of psbA codons that are complementary to the
chloroplast tRNA population.
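
The codon comparison above reduces to two simple counts: the AT content of a sequence and the share of its codons that fall in a preferred set. The following Python sketch illustrates the calculation only. It is not the analysis software used in this study, and the function names, the toy sequence and the preferred codon set are hypothetical placeholders rather than CTB or psbA data.

```python
def at_content(seq: str) -> float:
    """Return the percentage of A/T bases in a DNA sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return 100.0 * sum(base in "AT" for base in seq) / len(seq)


def codons(seq: str) -> list[str]:
    """Split an in-frame coding sequence into codons (trailing bases are dropped)."""
    seq = seq.upper()
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def percent_preferred(seq: str, preferred: set[str]) -> float:
    """Return the percentage of codons in seq that belong to the preferred set."""
    cds = codons(seq)
    if not cds:
        raise ValueError("sequence shorter than one codon")
    return 100.0 * sum(codon in preferred for codon in cds) / len(cds)


# Hypothetical toy inputs, for illustration only (not real gene data).
toy_gene = "ATGAAATTAGCTGGT"            # codons: ATG AAA TTA GCT GGT
toy_preferred = {"AAA", "TTA", "GCT"}   # assumed "preferred" codons

print(round(at_content(toy_gene), 1))              # 10 of 15 bases are A or T
print(percent_preferred(toy_gene, toy_preferred))  # 3 of 5 codons are preferred
```

Applied to a real coding sequence and a codon set derived from psbA or from the chloroplast tRNA pool, the same two functions would give figures of the kind quoted above.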

Also, stable incorporation of the CTB gene into the precise location between the trnA
and trnI genes of the chloroplast genome by homologous recombination should eliminate
the 'position effect' frequently observed in nuclear transgenic plants. This should allow
uniform expression levels in different transgenic lines. Amplification of the
transgene should result in a high level of CTB gene expression since each plant cell contains
up to 50,000 copies of the plastid genome (17). Another significant advantage of the
production of CTB in chloroplasts is the ability of chloroplasts to form disulfide bridges
(12,18,19), which are necessary for the correct folding and assembly of the CTB pentamer
(20).

In this study, we report the integration of the CTB gene into the inverted repeat region
of the tobacco chloroplast genome, which provides two copies of the CTB gene per chloroplast
genome and results in chloroplasts accumulating high levels of CTB. This eliminates the need
to modify the CTB gene for optimal expression in plants.

Construction of the Chloroplast Expression Vector pLD-CTB: The leader sequence (63
bp) of the native CTB gene was deleted and a start codon was introduced at the 5' end.
Primers were designed to introduce a ribosome binding site (rbs) 5 bases upstream of the
start codon. The CTB PCR product was then cloned into the multiple cloning site of the
pCR2.1 vector (Invitrogen) and subsequently into the chloroplast expression vector pLD-CtV2
using suitable restriction sites. Restriction enzyme digestions of the pLD-LH-CTB vector
were done to confirm the correct orientation of the inserted fragment.

Expression of the pLD-LH-CTB vector was tested in the E. coli XL-1 Blue MRF' TC strain
before tobacco transformation. E. coli was transformed by standard CaCl2 transformation
procedures. Transformed E. coli (24 and 48 h cultures in 100 ml TB with 100 µg/ml
ampicillin) and untransformed E. coli (24 and 48 h cultures in 100 ml TB with 12.5 µg/ml
tetracycline) were centrifuged for 15 min. The pellet obtained was washed twice with 200 mM
Tris-Cl, resuspended in 500 µl extraction buffer (200 mM Tris-Cl, pH
8.0, 100 mM NaCl, 10 mM EDTA, 2 mM PMSF) and sonicated. To aliquots of 100 µl of
transformed and untransformed sonicates [containing 50-100 µg of crude protein extract
as determined by Bradford protein assay (Bio-Rad)] and purified CTB (100 ng, Sigma), 2X
SDS sample buffer was added. These sample mixtures were loaded on a 15% SDS-PAGE gel
and electrophoresed at 200 V for 45 min in Tris-glycine buffer (25 mM Tris, 250
mM glycine, pH 8.3, 0.1% SDS). The separated protein was transferred to a nitrocellulose
membrane by electroblotting at 70 V for 90 min.

Immunoblot Analysis of CTB Production in E. coli: Nonspecific antibody reactions were
blocked by incubation of the membrane in 25 ml of 5% non-fat dry milk in TBS buffer for
2 h on a rotary shaker (40 rpm), followed by washing in TBS buffer for 5 min. The
membrane was incubated for 1 h in 30 ml of a 1:5000 dilution of rabbit anti-cholera antiserum
(Sigma) in TBST (TBS with 0.05% Tween-20) containing 1% non-fat dry milk, followed
by washing thrice in TBST. The membrane was then incubated for 1 h at room temperature in
30 ml of a 1:10,000 dilution of alkaline phosphatase conjugated mouse anti-rabbit IgG (Sigma)
in TBST, washed thrice in TBST and once in TBS, and developed for 1 h in the
alkaline phosphatase color development reagents, BCIP/NBT in AP color development
buffer (Bio-Rad).

Bombardment and Regeneration of Chloroplast Transgenic Plants: Fully expanded, dark
green leaves of about two-month-old Nicotiana tabacum var. Petit Havana plants were
placed abaxial side up on filter papers in Petri dishes containing RMOP medium (21).
Microprojectiles coated with pLD-LH-CTB DNA were bombarded into the leaves using the
biolistic device PDS-1000/He (Bio-Rad), as described by Daniell (21). Following incubation
at 24 °C in the dark for two days, the bombarded leaves were cut into small (~5 mm²) pieces
and placed abaxial side up (5 pieces/plate) on selection medium (RMOP containing 500 mg/L
spectinomycin dihydrochloride). Spectinomycin-resistant shoots obtained after about 1-2
months were cut into small pieces (~2 mm²) and placed on the same selection medium.

PCR Analysis: Total plant DNA from putative transgenic and untransformed plants was
isolated using the DNeasy kit (Qiagen). PCR primers 3P (5'-AAAACCCGTCCTCAGTTCGGATTGC-3')
and 3M (5'-CCGCGTTGTTTCATCAAGCCTTACG-3') were used for
PCR on putative transgenic and untransformed plant total DNA. Samples were carried
through 30 cycles using the following temperature sequence: 94 °C for 1 min, 62 °C for
1.5 min and 72 °C for 2 min. Cycles were preceded by denaturation for 5 min at 94 °C.
PCR-confirmed shoots from the second selection were transferred to rooting medium (MSO
medium containing 500 mg/L spectinomycin).
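
As a quick consistency check on the PCR conditions above, the sketch below computes the length and GC content of primers 3P and 3M as written in this section, together with the total programmed hold time of the cycling protocol. The helper names are illustrative assumptions, and ramp times between temperature steps are ignored.

```python
# Primer sequences as given in the PCR Analysis paragraph (5' to 3').
PRIMERS = {
    "3P": "AAAACCCGTCCTCAGTTCGGATTGC",
    "3M": "CCGCGTTGTTTCATCAAGCCTTACG",
}

# Cycling program from the text: 5 min at 94 C, then 30 cycles of
# 94 C for 1 min, 62 C for 1.5 min and 72 C for 2 min.
INITIAL_DENATURATION_MIN = 5.0
CYCLE_STEPS_MIN = (1.0, 1.5, 2.0)
N_CYCLES = 30


def gc_percent(seq: str) -> float:
    """Return the percentage of G/C bases in a primer."""
    seq = seq.upper()
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


def total_hold_time_min() -> float:
    """Sum of programmed hold times only; ramping between steps is ignored."""
    return INITIAL_DENATURATION_MIN + N_CYCLES * sum(CYCLE_STEPS_MIN)


for name, seq in PRIMERS.items():
    print(name, len(seq), "nt,", gc_percent(seq), "% GC")
print("programmed hold time:", total_hold_time_min(), "min")
```

Both primers come out as 25-mers with 52% GC, and the program holds for 140 min in total.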

Southern Blot Analysis: Ten micrograms of total plant DNA (isolated using the DNeasy kit)
per sample were digested with BglII, separated on a 0.7% agarose gel and transferred to a
nylon membrane. A 0.8 kb fragment probe, homologous to the chloroplast border sequences,
was generated by digesting vector DNA with BglII and BamHI. Hybridization was
performed using the Ready To Go protocol (Pharmacia). Plants confirmed by Southern blot
were transferred to pots. On flowering, seeds obtained from T0 lines were germinated on
MSO medium containing spectinomycin dihydrochloride, and T1 seedlings were grown in bottles
containing MSO with spectinomycin (500 mg/L) for 2 weeks. The plants were later
transferred to pots.

Western Blot Analysis of Plant Protein: Transformed and untransformed leaves (100 mg)
were ground in liquid nitrogen and resuspended in 500 µl of extraction buffer (200 mM
Tris-Cl, pH 8.0, 100 mM NaCl, 10 mM EDTA, 2 mM PMSF). Leaf extracts (100-120 µg, as
determined by Lowry assay) were either boiled (4 min) or left unboiled in reducing sample
buffer (Bio-Rad) and electrophoresed in 12% polyacrylamide gels using the buffer system of
Laemmli (22). The separated proteins were transferred to a nitrocellulose membrane by
electroblotting at 85 V for 1 h. The immunoblot detection procedure was similar to that used
for the E. coli blots described above. For chemiluminescent detection, the S-Tag™ AP
LumiBlot kit (Novagen) was used.

ELISA Quantification of CTB: Leaves (100 mg) from transformed and untransformed plants
were ground in liquid nitrogen and resuspended in bicarbonate buffer, pH 9.6 (15 mM
Na2CO3, 35 mM NaHCO3). Different concentrations of these extracts (100 µl/well) were bound
to a 96-well polyvinyl chloride microtiter plate (Costar) overnight at 4 °C. The background
was blocked with 1% bovine serum albumin (BSA) in 0.01 M phosphate buffered saline (PBS)
for 2 h at 37 °C. The wells were washed thrice with washing buffer, PBST (PBS with 0.05%
Tween 20), and rabbit anti-cholera serum diluted 1:8,000 in PBST containing 0.5% BSA was
added and incubated for 2 h at 37 °C. The wells were washed and incubated with 1:50,000
mouse anti-rabbit IgG-alkaline phosphatase conjugate in PBST containing 0.5% BSA for 2 h
at 37 °C. The plate was developed with Sigma Fast pNPP substrate (Sigma) for 30 min at room
temperature, the reaction was stopped by addition of 3 N NaOH, and the plates were read at
405 nm.

GM1 Ganglioside Binding Assay: To determine the affinity of chloroplast-derived CTB for
GM1 gangliosides, microtiter plates were coated with monosialoganglioside GM1 (Sigma)
(3.0 µg/ml in bicarbonate buffer) and incubated at 4 °C overnight. As a control, BSA
(3.0 µg/ml in bicarbonate buffer) was coated on some wells. The wells were blocked with 1%
BSA in PBS for 2 h at 37 °C, washed thrice with washing buffer (PBST) and incubated with
dilutions of transformed plant protein, untransformed plant protein and bacterial CTB in
PBS. Incubation of plates with primary and secondary antibody dilutions and detection were
similar to the CTB ELISA procedure described above.

pLD-LH-CTB vector construction and E. coli expression: The pLD-LH-CTB vector
integrates the genes of interest into the inverted repeat regions of the chloroplast genome
between the trnI and trnA genes. Integration occurs through homologous recombination
events between the trnI and trnA chloroplast border sequences of the vector and the
corresponding homologous sequences of the chloroplast genome, as shown in Fig. 1A. The
chimeric aminoglycoside 3' adenyltransferase (aadA) gene, which confers resistance to
spectinomycin-streptomycin, and the CTB gene downstream of it are driven by the
constitutive promoter of the rRNA operon (Prrn), and transcription is terminated by the
psbA 3' untranslated region. Since the protein synthetic machinery of chloroplasts is similar
to that of E. coli (23), CTB expression from the pLD-LH-CTB vector was tested in E. coli.
Western blot analysis of sonicated E. coli whole cell extract showed the presence of 11 kDa
CTB monomers, similar to those obtained when purified commercially available CTB was
treated in the same manner, as shown in Fig. 2A. Oligomeric expression of CTB was not
observed in E. coli, as expected, because the construct lacks the leader peptide sequence of
the native CTB gene, which directs the CTB monomer into the periplasmic space, allowing for
concentration and oligomeric assembly.

Selection and Regeneration of Transgenic Plants: Bombarded leaf pieces placed
on selection medium continued to grow but were bleached. Green shoots emerged from the
part of the leaf in contact with the medium. Five rounds of bombardment (5 leaves each)
resulted in 68 independent transformation events. Each such transgenic line was subjected
to a second round of antibiotic selection. These putative transformants were subjected to
PCR analysis to distinguish them from nuclear transformants and mutants.

Determination of Chloroplast Integration and Homoplasmy: PCR and Southern
hybridization were used to determine integration of the CTB gene into the chloroplast
genome. Primers 3P and 3M, designed to confirm incorporation of the gene cassette into
the chloroplast genome, were used to screen putative transgenics initially. The primer 3P
landed on the chloroplast genome outside of the chloroplast flanking sequence used for
homologous recombination, as shown in Fig. 1A. The primer 3M landed on the aadA gene.
No PCR product should be obtained if foreign genes are integrated into the nuclear genome
or in mutants lacking the aadA gene. The presence of the 1.6 kb PCR product in 9 of the 10
putative transgenics screened confirmed the site-specific integration of the gene cassette into
the chloroplast genome. Database searches showed that neither the 3P nor the 3M primer had
homology with other gene sequences, so random priming was not expected. This is
confirmed by the absence of PCR product in untransformed plants (Fig. 1B). A similar
strategy has been used successfully by us to confirm chloroplast integration of
foreign genes (13,14,24,25). This screening is essential to eliminate mutants and nuclear
transformants, and saves the space and labor of maintaining hundreds of transgenic lines.

Southern blot analysis of three of the PCR-positive transgenic lines was done to
further confirm site-specific integration and to establish copy number. In the chloroplast
genome, BglII sites flank the chloroplast border sequences 5' of 16S rRNA and 3' of the trnA
region, as shown in Fig. 3A. A 6.17 kb fragment from a transformed plant and a 4.47 kb
fragment from an untransformed plant were obtained when total plant DNA from
transformed and untransformed plants was digested with BglII. The blot of the digested
products was probed with a 32P random primer-labeled 0.81 kb trnI-trnA fragment. The
probe hybridized with the control giving a 4.47 kb fragment as expected, while for the
transgenic lines a 6.17 kb fragment was observed, indicating that all plastid genomes had the
gene cassette inserted between the trnI and trnA regions. The absence of a 4.47 kb fragment
in transgenic lines indicates that homoplasmy has been achieved, to the detection level of
a Southern blot. These results explain the high levels of CTB observed in transgenic tobacco
plants. Plants confirmed by Southern blot and transferred to pots showed no adverse
pleiotropic effects when compared to untransformed plants, as shown in Fig. 4A. Southern
blot analysis of T1 plants in Fig. 3C shows that all 4 transgenic lines analyzed maintained
homoplasmy.
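
The reasoning used to read the Southern blot can be written down as a small decision rule. The Python sketch below is an illustration under stated assumptions, not a procedure used in this study: the fragment sizes come from the text, while the function names and the 0.1 kb matching tolerance are hypothetical.

```python
WILD_TYPE_KB = 4.47    # BglII fragment from untransformed plastid genomes
TRANSGENIC_KB = 6.17   # BglII fragment carrying the integrated gene cassette


def has_band(bands_kb, expected_kb, tol_kb=0.1):
    """True if any observed band lies within tol_kb of the expected size."""
    return any(abs(band - expected_kb) <= tol_kb for band in bands_kb)


def classify_lane(bands_kb, tol_kb=0.1):
    """Interpret the hybridizing bands in one lane of the blot."""
    wild_type = has_band(bands_kb, WILD_TYPE_KB, tol_kb)
    transgenic = has_band(bands_kb, TRANSGENIC_KB, tol_kb)
    if transgenic and not wild_type:
        return "homoplasmic"      # only transformed genomes detected
    if transgenic and wild_type:
        return "heteroplasmic"    # both genome types detected
    if wild_type:
        return "untransformed"
    return "inconclusive"


# Size added between the BglII sites by the integrated cassette.
insert_kb = round(TRANSGENIC_KB - WILD_TYPE_KB, 2)

print(insert_kb)                      # 1.7
print(classify_lane([6.17]))          # homoplasmic
print(classify_lane([6.17, 4.47]))    # heteroplasmic
print(classify_lane([4.47]))          # untransformed
```

As noted above, a "homoplasmic" call of this kind holds only to the detection level of the blot.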

Immunoblot Analysis of Chloroplast Synthesized CTB: Anti-cholera toxin antibodies did
not show significant cross-reaction with tobacco plant protein, as can be seen in Fig. 2C,
lanes 1 & 2. Boiled and unboiled leaf homogenates were run on 12% SDS-PAGE gels.
Unboiled chloroplast synthesized CTB protein appeared as compact 45 kDa oligomers, as
shown in Fig. 2C, lane 4, similar to the unboiled, pentameric bacterial CTB, which appeared
to have partially dissociated into tetramers, trimers and monomers upon storage at 4 °C over
a period of several months (Fig. 2C, lane 7).
While heat treatment (4 min boiling) prior to SDS-PAGE of pentameric bacterial
CTB gave predominantly CTB monomers, with some protein in the dimeric and trimeric
form (Fig. 2C, lane 6), chloroplast synthesized CTB dissociated only into dimers and
trimers when subjected to similar heat treatment (Fig. 2C, lanes 3 & 5). These
results differ from the heat-induced dissociation of potato plant nucleus synthesized
CTB oligomers into monomers (8). A probable reason for this stability could be a more
stable conformation of chloroplast synthesized CTB, which may be an added advantage in
storage and administration of edible vaccines. Leaf homogenates from four different
transgenic plants showed very similar expression levels of CTB protein (see Fig. 2B).
This suggests very little clonal variation of CTB expression, as was confirmed later by
ELISA quantification assays. Consistent expression levels of recombinant proteins in plants
(as obtained for CTB in this research) may be essential for production of edible vaccines in
plants.

ELISA Quantification of CTB Expression: Comparison of the absorbance at 405 nm of a
known amount of bacterial CTB-antibody complex (linear standard curve) with that of a
known concentration of transformed plant total soluble protein was used to estimate CTB
expression levels. Optimal dilutions of total soluble protein from two transgenic lines were
loaded in wells of the microtiter plate. As reported previously (8), it was necessary to
optimize the dilutions of total soluble protein, because the levels of CTB protein detected
varied with the concentration of total soluble protein: excessively high concentrations of
total soluble protein inhibited the CTB protein from binding to the wells of the plate. Both
T0 lines yielded CTB protein levels ranging from 3.5% to 4.1% of the total soluble protein
(40 µg of chloroplast synthesized CTB protein in 1 mg of total soluble protein), as shown in
Fig. 5A. Also, estimation of CTB protein expression levels in leaves at different stages
(young, mature and old) determined that mature leaves have the highest levels of CTB protein
expression. This is in accordance with the results obtained in similar experiments in which
the Bt Cry2Aa2 gene was expressed without the putative chaperonin genes,
but contrary to results with the Bt Cry2Aa2 operon, which showed high expression levels
in older leaves, probably due to the stable crystalline structure (13).

GM1 Ganglioside ELISA Binding Assays: Both chloroplast synthesized and bacterial CTB
demonstrated a strong affinity for GM1 gangliosides (see Fig. 5B), indicating that
chloroplast synthesized CTB conserved the antigenic sites necessary for binding of the CTB
pentamer to the pentasaccharide GM1. The GM1 binding ability also suggests proper folding
of CTB molecules resulting in the pentameric structure. Since oxidation of cysteine residues
in the B subunits is a prerequisite for in vivo formation of CTB pentamers (20), proper
folding is a further confirmation of the ability of chloroplasts to form disulfide bonds.
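
The ELISA quantification of CTB expression rests on a linear standard curve, and the calculation can be sketched as follows. This is a minimal illustration, not the procedure or data of this study: the standard amounts, absorbance readings and function names are hypothetical, chosen so that the example reproduces the ratio quoted in the text (40 µg of CTB per 1 mg of total soluble protein, i.e. 4%).

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def ctb_percent_of_tsp(a405, slope, intercept, tsp_ng):
    """Read CTB (ng) off the standard curve and express it as % of TSP."""
    ctb_ng = (a405 - intercept) / slope
    return 100.0 * ctb_ng / tsp_ng


# Hypothetical standards: ng of bacterial CTB per well vs. absorbance at 405 nm.
standards_ng = [0, 10, 20, 30, 40]
standards_a405 = [0.05, 0.25, 0.45, 0.65, 0.85]

slope, intercept = fit_line(standards_ng, standards_a405)

# Hypothetical sample well: 1000 ng total soluble protein reading 0.85 at 405 nm.
percent = ctb_percent_of_tsp(0.85, slope, intercept, tsp_ng=1000)
print(round(percent, 2))
```

In practice, only readings that fall within the linear range of the standard curve, obtained at optimized dilutions of total soluble protein, would be interpolated in this way.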

High levels of expression of CTB in transgenic tobacco did not affect growth rates,
flowering or seed setting, as observed in this laboratory, unlike what was previously reported
for the synthetic LTB gene constitutively expressed via the nuclear genome (7).
When germinated on antibiotic medium, transformed plant seedlings were green, while
untransformed seedlings lacking the aadA gene were bleached white, as shown in Fig. 4B.

The potential use of this technology is three-fold. It can be used for large-scale
production of purified CTB, as an edible vaccine if expressed in an edible
plant, or as a transmucosal carrier of peptides to which it is fused, so as to either enhance
mucosal immunity or induce oral tolerance to the products of these peptides (5). Large-scale
production of purified CTB in bacteria involves the use of expensive fermentation
techniques and stringent purification protocols (26), making this a prohibitively expensive
technology for developing countries. The cost of producing 1 kg of recombinant protein in
transgenic crops has been estimated to be 50 times lower than the cost of producing the same
amount by E. coli fermentation, assuming that recombinant protein is 20% of total E. coli
protein (27). Thus, isolation and lysis of CTB-producing chloroplasts from chloroplast
transformed plants could serve as a cost-effective means of mass production of purified CTB.
If CTB is used as an edible vaccine, a selection scheme eliminating the use of antibiotic
resistance genes should be developed. One such scheme uses the betaine aldehyde dehydrogenase
(BADH) gene, which converts toxic betaine aldehyde to nontoxic glycine betaine, an
osmoprotectant (28). Also, several other
strategies have been proposed to eliminate antibiotic resistance genes from transgenic plants
(29).

Transgenic potato plants that synthesize a CTB-insulin fusion protein at levels of up
to 0.1% of the total soluble tuber protein have been found, when fed to nonobese diabetic
mice, to produce a substantial reduction in pancreatic islet inflammation and a delay in the
progression of clinical diabetes (30). This
may prove to be an effective clinical approach for prevention of spontaneous autoimmune
diabetes. Since increased CTB expression levels have been shown to be achievable via the
chloroplast genome through this research, expression of a CTB-proinsulin fusion protein in
the chloroplasts of edible tobacco (LAMD) is currently being tested in our laboratory. While
existing expression levels of CTB via the chloroplast genome are adequate for commercial
exploitation, levels can be increased further (about 10-fold) by insertion of a putative
chaperonin, as in the case of the Bt Cry2Aa2 operon (13), which would likely also aid in the
subsequent purification of recombinant CTB due to crystallization.

Figure 6: 12% reducing PAGE. Chemiluminescent
detection with rabbit anti-cholera serum (1°) and AP
labeled mouse anti-rabbit IgG (2°) antibodies.
Untransformed, boiled (1) and unboiled (2); Transformed,
boiled (3 & 5) and unboiled (4); Purified CTB, boiled (6) and
unboiled (7); Marker (8).

* HSA: Nuclear transformation of potato plants.

Figure 7: A, B) Reducing gels. 1: Markers; 2: Transgenic
extract showing expression of light (A) and heavy chain
(B) in chloroplasts; 3: Untransformed; 4: Human IgA. C)
Non-reducing gel. 1: Transgenic extract showing assembly;
2: Untransformed; 3: Human IgA. Blots A & C were
detected with AP conjugated goat anti-human kappa
antibody. Blot B was detected with AP conjugated goat
anti-human IgA antibody.

Figure 8: Western blot of transgenic potato tubers, cv.
Desiree. 30 µg of tuber protein was loaded per lane and
probed with anti-HSA antibody. 1: wild type; 2: 40 ng of
pure HSA; 3-8: different transgenic lines, showing
different levels of expression.



* Expression of HSA by chloroplast vectors in E. coli.

Figure 9: Frequency histogram of Kennebec and Desiree
transgenic plants expressing different HSA levels.
Results are shown as the percentages of transgenic
plants (vertical axis) that express a specific level of
HSA as a percentage of the total soluble protein
(horizontal axis).

* Codon composition and expression levels.

Figure 10: Western blot of E. coli protein
extracts. 1: 50 ng pure HSA; 2: molecular weight
marker; 3: pLD-HSA (control without RBS); 4:
pLD-5'UTR-HSA; 5: pLD-RBS-HSA; 6: pLD-ORF1+2-HSA;
7: E. coli without pLD vector.

Table 1: Unmodified native codon composition and expression
levels observed in transgenic chloroplasts. See section d) for
details of AT content, % psbA optimal codons and % of codons
that match the cp tRNA pool. TSP: % total soluble protein.

Figure 3: Cry2A protein concentration determined by ELISA in
transgenic leaves. Note the 100-fold increase in protein accumulation in
the presence of the putative chaperonin, ORF2.

Figure 2: Immunogold labeled electron microscopy
of a mature transgenic leaf. Cry2Aa2 crystals in a
transgenic chloroplast expressing the cry2A
operon.

* Expression of a small (22 aa) peptide in transgenic chloroplasts.

Bioassay: P. aeruginosa

Panels: Transgenic; Untransformed.

Figure 3. Leaves were infected with 10 µl of 8x10^5,
8x10^4, 8x10^3 and 8x10^2 cells of P. syringae. Photos
were taken 5 days after inoculation. 1-2 µg of
antimicrobial peptide (AMP) is required to kill 1000
bacterial cells. Local concentration at the site of
infection is estimated to be 200-800 µg AMP.

Panels: Transgenic; Wild type; Buffer only.

Figure 4. Total plant protein was mixed with 5 µl of mid-log
phase bacteria from overnight culture, incubated for 2
hours at 25 °C at 125 rpm and grown in LB broth overnight.
Based on a minimum inhibitory concentration of 1-2 µg
AMP/1000 bacterial cells, the expression level was
calculated to be 21.5-43% of the total soluble protein.

* Expression of oligomeric form (disulfide bonded) CTB in transgenic chloroplasts.

Figure 5: A) CTB ELISA quantification is shown as a percentage of the total soluble plant protein. Total soluble
plant protein from young, mature and old leaves of transgenic lines 3 and 7 was quantified. B) CTB-GM1
ganglioside binding ELISA assays: Plates coated first with GM1 gangliosides and BSA were plated with total
soluble plant protein from lines 3 and 7, untransformed plant total soluble protein and purified bacterial CTB. The
absorbance of the GM1 ganglioside-CTB antibody complex was measured.

PRODUCTION OF HUMAN SERUM ALBUMIN
IN TRANSGENIC TOBACCO

FIELD OF THE INVENTION

This invention relates to production of high value pharmaceutical proteins in nuclear 
transgenic plants, particularly to production of human serum albumin in transgenic tobacco. 

BACKGROUND 

Human serum albumin (HSA) is a monomeric globular protein consisting of a single, generally
nonglycosylated polypeptide chain of 585 amino acids (66.5 kDa and 17 disulfide bonds) with no
post-translational modifications. It is composed of three structurally similar globular domains, and the
disulfides are positioned in a repeated series of nine loop-link-loop structures centered around eight
sequential Cys-Cys pairs. HSA is initially synthesized as pre-pro-albumin by the liver and released
from the endoplasmic reticulum after removal of the aminoterminal prepeptide of 18 amino acids.
The pro-albumin is further processed in the Golgi complex, where the other 6 aminoterminal residues
of the propeptide are cleaved by a serine proteinase (1). This results in the secretion of the mature
polypeptide of 585 amino acids. HSA is encoded by two codominant autosomal allelic genes. HSA
belongs to a multigene family of proteins that includes alpha-fetoprotein and human group-specific
component (Gc), or vitamin D-binding protein. HSA facilitates transfer of many ligands across organ
circulatory interfaces such as in the liver, intestine, kidney and brain. In addition to blood plasma,
serum albumin is also found in tissues. HSA accounts for about 60% of the total protein in blood
serum. The concentration of albumin is 40 mg/ml in the serum of human adults.

Medical applications of HSA: The primary function of HSA is the maintenance of colloid osmotic
pressure (COP) within the blood vessels. Its abundance makes it an important determinant of the
pharmacokinetic behavior of many drugs. Reduced synthesis of HSA can be due to advanced liver
disease, impaired intestinal absorption of nutrients or poor nutritional intake. Increased albumin
losses can be due to kidney diseases (increased glomerular permeability to macromolecules in the
nephrotic syndrome), intestinal diseases (protein-losing enteropathies) or exudative skin disorders
(burns). Catabolic states such as chronic infections, sepsis, surgery, intestinal resection, trauma or
extensive burns can also cause hypoalbuminemia. HSA is used in therapy of blood volume disorders,
for example posthaemorrhagic acute hypovolemia or extensive burns, in treatment of dehydration
states, and also for cirrhotic and hepatic illnesses. It is also used as an additive in perfusion liquid for
extracorporeal circulation. HSA is used clinically for replacing blood volume, but also has a variety
of non-therapeutic uses, including its role as a stabilizer in formulations for other therapeutic proteins.
HSA is a stabilizer for biological materials in nature and is used for preparing biological standards and
reference materials. Furthermore, HSA is frequently used as an experimental antigen, a cell-culture
constituent and a standard in clinical chemistry tests.

Expression systems for HSA: The expression and purification of recombinant HSA from various
microorganisms has been reported previously (2-6). Saccharomyces cerevisiae has been used to
produce HSA both intracellularly, requiring denaturation and refolding prior to analysis (7), and by
secretion (8). Secreted HSA was equivalent structurally, but the recombinant product had lower
levels of expression (recovery) and structural heterogeneity compared to the blood derived protein
(9). HSA was also expressed in Kluyveromyces lactis, a yeast with good secretory properties,
achieving 1 g/liter in fed batch cultures (10). Ohtani et al. (11) developed an HSA expression system
using Pichia pastoris and established a purification method obtaining recombinant protein with similar
levels of purity and properties as the human protein. In Bacillus subtilis, HSA could be secreted
using bacterial signal peptides (4). HSA production in E. coli was successful but required additional
in vitro processing with trypsin to yield the mature protein (3). Sijmons et al. (12) expressed HSA
in transgenic potato and tobacco plants. Fusion of HSA to the plant PR-S presequence resulted in
cleavage of the presequence at its natural site and secretion of correctly processed HSA that was
indistinguishable from the authentic human protein. The expression was 0.014% of the total soluble
protein. However, none of these methods has been exploited commercially.

Challenges in commercial production of HSA: Albumin is currently obtained by protein
fractionation from plasma and is the world's most used intravenous protein, estimated at around 500
metric tons per year. Albumin is typically administered by intravenous injection of solutions
containing 20% albumin. The average dosage of albumin for each patient varies between 20 and 40
grams/day. The consumption of albumin is around 700 kilograms per million inhabitants per year. In
addition to its high cost, HSA carries the risk of transmitting diseases, as with other blood-derived
products. The price of albumin is about $3.7/g. Thus, the market for this protein amounts to
approximately 0.7 billion dollars per year in the USA. Because of the high cost of albumin, synthetic
macromolecules (like dextrans) are used to increase plasma colloid osmotic pressure.
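
The market estimate above follows from the per-capita consumption and the unit price. The short Python sketch below reproduces the arithmetic. The consumption and price figures are taken from the text, whereas the US population value (275 million, roughly the figure around the year 2000) is an assumption introduced here for illustration.

```python
CONSUMPTION_KG_PER_MILLION = 700.0   # kg albumin per million inhabitants per year (from text)
PRICE_USD_PER_G = 3.7                # USD per gram (from text)
US_POPULATION_MILLIONS = 275.0       # assumption, not stated in the text


def annual_market_usd(population_millions: float) -> float:
    """Annual albumin market value for a given population, in USD."""
    grams_per_year = population_millions * CONSUMPTION_KG_PER_MILLION * 1000.0
    return grams_per_year * PRICE_USD_PER_G


market = annual_market_usd(US_POPULATION_MILLIONS)
print(f"{market / 1e9:.2f} billion USD per year")
```

With these inputs the estimate comes to roughly 0.71 billion dollars per year, in line with the figure of about 0.7 billion quoted above.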

Commercial HSA is mainly prepared from human plasma. This source hardly meets the
requirements of the world market. The availability of human plasma is limited, and careful heat
treatment of the product must be performed to avoid potential contamination
by hepatitis, HIV and other viruses. The costs of HSA extraction from blood are very high.
Innovative production systems are needed to meet the demands of the large albumin market with a
safe product at a low cost. Plant biotechnology offers the promise of obtaining safe and cheap
proteins to be used to treat human diseases.

Chloroplast genetic engineering: When we developed the concept of chloroplast genetic 
5 engineering (13,14), it was possible to introduce isolated intact chloroplasts into protoplasts and 
regenerate transgenic plants (15). Therefore, early investigations on chloroplast transformation 
focused on the development of in organello systems using intact chloroplasts capable of efficient and 
prolonged transcription and translation (16 - 18) and expression of foreign genes in isolated 
chloroplasts (19). However, after the discovery of the gene gun as a transformation device (20), it 

10 was possible to transform plant chloroplasts without the use of isolated plastids and protoplasts. 
Chloroplast genetic engineering was accomplished in several phases. Transient expression of foreign 
genes in plastids of dicots (21,22) was followed by such studies in monocots (23). Unique to the 
chloroplast genetic engineering is the development of a foreign gene expression system using 
autonomously replicating chloroplast expression vectors (21). Stable integration of a selectable
marker gene into the tobacco chloroplast genome (24) was also accomplished using the gene gun.
However, useful genes conferring valuable traits via chloroplast genetic engineering have been 
demonstrated only recently. For example, plants resistant to Bt-sensitive insects were obtained by
integrating the cry1Ac gene into the tobacco chloroplast genome (25). Plants resistant to
Bt-resistant insects (up to 40,000-fold) were obtained by hyper-expression of the cry2A gene within the
tobacco chloroplast genome (26). Plants have also been genetically engineered via the chloroplast
genome to confer herbicide resistance, and the introduced foreign genes were maternally inherited,
overcoming the problem of out-cross with weeds (27). Chloroplast genetic engineering technology 
is currently being applied to other useful crops (14,28). 

Investigations in progress: A remarkable feature of chloroplast genetic engineering is the
observation of exceptionally large accumulation of foreign proteins in transgenic plants, as much as
46% of CRY protein in total soluble protein, even in bleached old leaves (29, see attached report De
Cosa et al. 2001). Stable expression of a pharmaceutical protein in chloroplasts was first reported
for GVGVP, a protein based polymer with varied medical applications (such as the prevention of
post-surgical adhesions and scars, wound coverings, artificial pericardia, tissue reconstruction and
programmed drug delivery (30)). Subsequently, expression of the human somatotropin via the
tobacco chloroplast genome (31) to high levels (7% of total soluble protein) was observed. The
following investigations that are in progress in our laboratory illustrate the power of this technology
to express small peptides, entire operons, vaccines that require oligomeric proteins with stable
disulfide bridges and monoclonals that require assembly of heavy/light chains via chaperonins.

Engineering novel pathways via the chloroplast genome: In plant and animal cells, nuclear
mRNAs are translated monocistronically. This poses a serious problem when engineering multiple
genes in plants (32). Therefore, to express the polyhydroxybutyrate polymer or Guy's 13 antibody,
single genes were first introduced into individual transgenic plants. Then, these plants were
back-crossed to reconstitute the entire pathway or the complete protein (33,34). Similarly, in a
seven-year-long effort, Ye et al. (22) recently introduced a set of three genes for a short biosynthetic pathway
that resulted in β-carotene expression in rice. In contrast, most chloroplast genes of higher plants are
cotranscribed (32). Expression of polycistrons via the chloroplast genome provides a unique
opportunity to express entire pathways in a single transformation event. We have recently used the
Bacillus thuringiensis (Bt) cry2Aa2 operon as a model system to demonstrate operon expression and
crystal formation via the chloroplast genome (29). Cry2Aa2 is the distal gene of a three-gene operon.
The orf immediately upstream of cry2Aa2 codes for a putative chaperonin that facilitates the folding
of Cry2Aa2 (and other proteins) to form proteolytically stable cuboidal crystals (35).

Therefore, the cry2Aa2 bacterial operon was expressed in tobacco chloroplasts to test the
resultant transgenic plants for increased expression and improved persistence of the accumulated
insecticidal protein(s). Stable foreign gene integration was confirmed by PCR and Southern blot
analysis in T0 and T1 transgenic plants. Cry2Aa2 operon-derived protein accumulated at 45.3% of
the total soluble protein in mature leaves and remained stable even in old bleached leaves (46.1%) (see
figure number 4 in attached article De Cosa et al. 2001, 29). This is the highest level of foreign gene
expression ever reported in transgenic plants. Exceedingly difficult to control insects (10-day old
cotton bollworm, beet armyworm) were killed 100% after consuming transgenic leaves. Electron
micrographs showed the presence of the insecticidal protein folded into cuboidal crystals similar in
shape to Cry2Aa2 crystals observed in Bacillus thuringiensis (see figure number 6 in attached article
De Cosa et al. 2001, 29).

In contrast to currently marketed transgenic plants with soluble CRY proteins, folded
protoxin crystals are processed only by target insects that have alkaline gut pH. This approach should
improve safety of Bt transgenic plants. Absence of insecticidal proteins in transgenic pollen 
eliminates toxicity to non-target insects via pollen. In addition to these environmentally friendly 
approaches, this observation should serve as a model system for large-scale production of foreign
proteins within chloroplasts in a folded configuration enhancing their stability and facilitating single
step purification. This is the first demonstration of expression of a bacterial operon in transgenic 
plants and opens the door to engineer novel pathways in plants in a single transformation event. 






Expressing small peptides via the chloroplast genome: It is common knowledge that the medical 
community has been fighting a vigorous battle against drug resistant pathogenic bacteria for years.
Cationic antibacterial peptides from mammals, amphibians and insects have gained more attention
over the last decade (36). Key features of these cationic peptides are a net positive charge, an affinity 
for negatively-charged prokaryotic membrane phospholipids over neutral-charged eukaryotic 
membranes and the ability to form aggregates that disrupt the bacterial membrane (37). 

There are three major peptides with α-helical structures: cecropin from Hyalophora cecropia
(giant silk moth), magainins from Xenopus laevis (African frog) and defensins from mammalian
neutrophils. Magainin and its analogues have been studied as a broad-spectrum topical agent, a
systemic antibiotic, a wound-healing stimulant and an anticancer agent (38). We have recently
observed that a synthetic lytic peptide (MSI-99, 22 amino acids) can be successfully expressed in 
tobacco chloroplast (39). The peptide retained its lytic activity against the phytopathogenic bacteria 
Pseudomonas syringae and multidrug resistant human pathogen, Pseudomonas aeruginosa. The 
anti-microbial peptide (AMP) used in this study was an amphipathic alpha-helix molecule that has an 
affinity for negatively charged phospholipids commonly found in the outer-membrane of bacteria. 

Upon contact with these membranes, individual peptides aggregate to form pores in the 
membrane, resulting in bacterial lysis. Because of the concentration dependent action of the AMP, 
it was expressed via the chloroplast genome to accomplish high dose delivery at the point of infection. 
PCR products and Southern blots confirmed chloroplast integration of the foreign genes and 
homoplasmy. Growth and development of the transgenic plants was unaffected by hyper-expression 
of the AMP within chloroplasts. In vitro assays with T0 and T1 plants confirmed that the AMP was
expressed at high levels (21.5 to 43% of the total soluble protein) and retained biological activity
against Pseudomonas syringae, a major plant pathogen. In situ assays resulted in intense areas of
necrosis around the point of infection in control leaves, while transformed leaves showed no signs
of necrosis (200-800 µg of AMP at the site of infection) as shown in Fig. 1. T1 in vitro assays against
Pseudomonas aeruginosa (a multi-drug resistant human pathogen) displayed a 96% inhibition of 
growth as shown in Fig. 2. These results give a new option in the battle against phytopathogenic and 
drug-resistant human pathogenic bacteria. Small peptides (like insulin) are degraded in most 
organisms. However, stability of this AMP in chloroplasts opens up this compartment for expression 
of hormones and other small peptides. 

Expression of cholera toxin β subunit oligomers as a vaccine in chloroplasts: Vibrio cholerae
causes acute watery diarrhea by colonizing the small intestine and producing the enterotoxin
cholera toxin (CT). Cholera toxin is a hexameric AB5 protein consisting of one toxic 27 kDa A
subunit having ADP ribosyl transferase activity and a nontoxic pentamer of 11.6 kDa B subunits
(CTB) that binds to the A subunit and facilitates its entry into the intestinal epithelial cells. CTB,
when administered orally (40), is a potent mucosal immunogen which can neutralize the toxicity of
the CT holotoxin by preventing it from binding to the intestinal cells (41). This is believed to be a
result of it binding to eukaryotic cell surfaces via the GM1 gangliosides, receptors present on the
intestinal epithelial surface, thus eliciting a mucosal immune response to pathogens (42) and
enhancing the immune response when chemically coupled to other antigens (43 - 46).

Cholera toxin B subunit (CTB) has previously been expressed in nuclear transgenic plants at levels of
0.01% (leaves) to 0.3% (tubers) of the total soluble protein. To increase expression levels, we
engineered the chloroplast genome to express the CTB gene (47). We observed expression of
oligomeric CTB at levels of 4-5% of total soluble plant protein as shown in Fig. 3A. PCR and
Southern Blot analyses confirmed stable integration of the CTB gene into the chloroplast genome.
Western blot analysis showed that transgenic chloroplast-expressed CTB was antigenically identical
to commercially available purified CTB antigen as shown in Fig. 4. Also, GM1-ganglioside binding
assays confirm that chloroplast-synthesized CTB binds to the intestinal membrane receptor of cholera
toxin as shown in Fig. 3B. Transgenic tobacco plants were morphologically indistinguishable from
untransformed plants and the introduced gene was found to be stably inherited in the subsequent
generation as confirmed by PCR and Southern Blot analyses. The increased production of an
efficient transmucosal carrier molecule and delivery system, like CTB, in chloroplasts of plants makes 
plant based oral vaccines and fusion proteins with CTB needing oral administration, a much more 
feasible approach. This also establishes unequivocally that chloroplasts are capable of forming 
disulfide bridges to assemble foreign proteins. 


Expression and assembly of monoclonals in transgenic chloroplasts: Dental caries (cavities) is 
probably the most prevalent disease of humankind. Colonization of teeth by S. mutans is the single 
most important risk factor in the development of dental caries. S. mutans is a non-motile, gram 
positive coccus. It colonizes tooth surfaces and synthesizes glucans (insoluble polysaccharide) and
fructans from sucrose using the enzymes glucosyltransferase and fructosyltransferase respectively
(48). The glucans play an important role by allowing the bacterium to adhere to the smooth tooth 
surfaces. After its adherence, the bacterium ferments sucrose and produces lactic acid. Lactic acid 
dissolves the minerals of the tooth, producing a cavity. 

A topical monoclonal antibody therapy to prevent adherence of S. mutans to teeth has
recently been developed. The incidence of cariogenic bacteria (in humans and animals) and dental
caries (in animals) was dramatically reduced for periods of up to two years after the cessation of the
antibody therapy. No adverse events were detected either in the exposed animals or in human
volunteers (49). The annual requirement for this antibody in the US alone may eventually exceed 1
metric ton. Therefore, this antibody was expressed via the chloroplast genome to achieve higher
levels of expression and proper folding (50). The integration of antibody genes into the chloroplast
genome was confirmed by PCR and Southern blot analysis. The expression of both heavy and light
chains was confirmed by western blot analysis under reducing conditions as shown in Figs. 5A and
B. The expression of fully assembled antibody was confirmed by western blot analysis under
non-reducing conditions as shown in Fig. 5C. This is the first report of successful assembly of a
multi-subunit human protein in transgenic chloroplasts. Production of monoclonal antibodies at agricultural
level should reduce their cost and create new applications of monoclonal antibodies.

Significance: Medical molecular pharming in transgenic plants has been reviewed recently (51). 
Since the demand for cheap and safe sources of HSA is expected to increase considerably in the 
coming years, it would be wise to ensure that in the future this protein will be available in significantly
larger amounts, preferably on a cost-effective basis. Because most genes can be expressed in many 
different systems, it is essential to determine which system offers the most advantages for the 
manufacture of the recombinant protein. The ideal expression system is one that produces a 
maximum amount of safe, biologically active material at a minimum cost. The use of modified 
mammalian cells with recombinant DNA techniques has the advantage of resulting in products which
are closely related to those of natural origin. However, culturing these cells is intricate and can only 
be carried out on limited scale. 

The use of microorganisms such as bacteria permits manufacture on a larger scale, but
introduces the disadvantage of producing products that differ appreciably from the products of
natural origin. For example, proteins that are usually glycosylated in humans are not glycosylated by
bacteria. Furthermore, human proteins that are expressed at high levels in E. coli frequently acquire
an unnatural conformation, accompanied by intracellular precipitation due to lack of proper folding
and disulfide bridges. Production of recombinant proteins in plants has many potential advantages
for generating biopharmaceuticals relevant to clinical medicine. These include the following: (i) plant
systems are more economical than industrial facilities using fermentation systems; (ii) technology is
available for harvesting and processing plants/plant products on a large scale; (iii) elimination of the
purification requirement when the plant tissue containing the recombinant protein is used as a food
(edible vaccines); (iv) plants can be directed to target proteins into stable, intracellular compartments
such as chloroplasts, or expressed directly in chloroplasts; (v) the amount of recombinant product that can
be produced approaches industrial-scale levels; and (vi) health risks due to contamination with
potential human pathogens/toxins are minimized.

It has been estimated that one tobacco plant should be able to produce more recombinant
protein than a 300-liter fermenter of E. coli. In addition, a tobacco plant produces a million seeds,
facilitating large-scale production. Tobacco is also an ideal choice because of its relative ease of
genetic manipulation and an impending need to explore alternate uses for this hazardous crop.
However, with the exception of enzymes (e.g. phytase), levels of foreign proteins produced in nuclear
transgenic plants are generally low, mostly less than 1% of the total soluble protein (52). May et al.
(53) discuss this problem using the following examples. Although plant derived recombinant hepatitis
B surface antigen was as effective as a commercial recombinant vaccine, the levels of expression in
transgenic tobacco were low (0.0066% of total soluble protein). Even though Norwalk virus capsid
protein expressed in potatoes caused oral immunization when consumed as food (edible vaccine),
expression levels were low (0.3% of total soluble protein). In particular, expression of human
proteins in nuclear transgenic plants has been disappointingly low: e.g. human interferon-β
0.000017% of fresh weight, human serum albumin 0.02% and erythropoietin 0.0026% of total soluble
protein (see Table 1 in ref. 52). A synthetic gene coding for the human epidermal growth factor was
expressed only up to 0.001% of total soluble protein in transgenic tobacco (53).

The cost of producing recombinant proteins in alfalfa leaves was estimated to be 12-fold 
lower than in potato tubers and comparable with seeds (52). However, tobacco leaves are much 
larger and have much higher biomass than alfalfa. The cost of production of recombinant proteins 
will be 50-fold lower than that of E. coli fermentation (with 20% expression levels, 52). A decrease
in insulin expression from 20% to 5% of biomass doubled the cost of production (54). Expression
levels less than 1% of total soluble protein in plants have been found not to be commercially feasible
(52). Therefore, it is important to increase levels of expression of recombinant proteins in plants to 
exploit plant production of pharmacologically important proteins. 
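
To make the gap concrete, the nuclear expression levels quoted above can be compared with the 1% of total soluble protein threshold cited for commercial feasibility. The sketch below is illustrative only: it uses the percentages given in the text, and the dictionary and function names are assumptions introduced here. Interferon-β is left out because its level is quoted per fresh weight rather than per total soluble protein.

```python
# Nuclear-transgenic expression levels quoted in the text,
# in percent of total soluble protein (TSP).
NUCLEAR_EXPRESSION_PCT_TSP = {
    "hepatitis B surface antigen": 0.0066,
    "Norwalk virus capsid protein": 0.3,
    "human serum albumin": 0.02,
    "erythropoietin": 0.0026,
    "human epidermal growth factor": 0.001,
}

# Threshold below which expression is described as not commercially feasible.
FEASIBILITY_THRESHOLD_PCT_TSP = 1.0


def fold_increase_needed(level_pct: float,
                         threshold_pct: float = FEASIBILITY_THRESHOLD_PCT_TSP) -> float:
    """Return the fold increase needed to reach the threshold (1.0 if already there)."""
    if level_pct <= 0:
        raise ValueError("expression level must be positive")
    return max(1.0, threshold_pct / level_pct)


if __name__ == "__main__":
    for protein, level in NUCLEAR_EXPRESSION_PCT_TSP.items():
        fold = fold_increase_needed(level)
        print(f"{protein}: {level}% TSP, needs about {fold:.0f}-fold more to reach 1% TSP")
```

For human serum albumin at 0.02% of total soluble protein, for instance, this simple ratio indicates that roughly a 50-fold increase would be required just to reach the quoted feasibility threshold.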

An alternate approach is to express foreign proteins in chloroplasts of higher plants. We have
recently integrated foreign genes (up to 10,000 copies per cell) into the tobacco chloroplast genome
resulting in accumulation of recombinant proteins up to 46% of the total cellular protein (29).
Chloroplast transformation utilizes two flanking sequences that, through homologous recombination, 
insert foreign DNA into the spacer region between the functional genes of the chloroplast genome, 
thus targeting the foreign genes to a precise location. This eliminates the "position effect" and gene
silencing frequently observed in nuclear transgenic plants. Chloroplast genetic engineering is an 
environmentally friendly approach, minimizing concerns of out-cross of introduced traits via pollen 
to weeds or other crops. Most importantly, a significant advantage in the production of 
pharmaceutical proteins in chloroplasts is their ability to process eukaryotic proteins, including folding
and formation of disulfide bridges (55). Chaperonin proteins are present in chloroplasts (56,57) that 
function in folding and assembly of prokaryotic/eukaryotic proteins. Also, proteins are activated by 
disulfide bond oxido/reduction cycles using the chloroplast thioredoxin system (58) or chloroplast 
protein disulfide isomerase (59). Accumulation of the fully assembled, disulfide-bonded form of human
somatotropin via chloroplast transformation (31), the oligomeric form of CTB (47) and assembly of
heavy and light chains of humanized Guy's 13 antibody in transgenic chloroplasts (50) provide strong
evidence for successful processing of pharmaceutical proteins inside chloroplasts. Such folding and
assembly should eliminate the need for highly expensive in vitro processing of pharmaceutical
proteins. For example, 60% of the total operating cost in the production of human insulin is
associated with in vitro processing (formation of disulfide bridges and cleavage of methionine) (54).


Taken together, low levels of expression of human proteins in nuclear transgenic plants, and 
difficulty in folding, assembly/processing of human proteins in E. coli should make chloroplasts an
ideal compartment for expression of recombinant proteins. Production of human proteins in 
transgenic chloroplasts also dramatically lowers the production cost. Large-scale production of 
human serum albumin in plants is a powerful approach to provide safe treatment to patients at an
affordable cost and provide tobacco farmers alternate uses for this hazardous crop. Therefore, it 
would be highly advantageous to provide for expression of human serum albumin in transgenic 
tobacco chloroplasts to increase levels of expression and accomplish in vivo processing. 

SUMMARY OF THE INVENTION

This invention synthesizes high value pharmaceutical proteins in transgenic plants by
chloroplast expression for pharmaceutical protein production. Chloroplasts are suitable for this 
purpose because of their ability to process eukaryotic proteins, including folding and formation of 
disulfide bridges, thereby eliminating the need for expensive post-purification processing. Tobacco
is an ideal choice for this purpose because of its large biomass, ease of scale-up (million seeds per
plant) and genetic manipulation. We use poly(GVGVP) as a fusion protein to enable hyper- 
expression of human serum albumin and accomplish rapid one step purification of fusion peptides 
utilizing the inverse temperature transition properties of this polymer. We also use human serum 
albumin-CTB fusion protein in chloroplasts of nicotine free edible tobacco (LAMD 605) for oral
delivery to NOD mice.

BRIEF DESCRIPTION OF DRAWINGS 
Fig. 1 shows a pair of photographs of leaves infected with 10 µl of 8x10^5, 8x10^4, 8x10^3 and
8x10^2 cells of P. syringae, taken 5 days after inoculation.

Fig. 2 is a graph of absorbance at 600 nm of total plant protein mixed with 5 µl of mid-log
phase bacteria from overnight culture, incubated for two hours at 25 °C at 125 rpm and grown in LB
broth overnight.

Fig. 3A is a graph of CTB ELISA quantification shown as percentage of the total soluble 
plant protein. 

Fig. 3B is a graph of a CTB-GM1 ganglioside binding ELISA assay in which plates coated first with
GM1 gangliosides and BSA were plated with total soluble plant protein.

Fig. 4 is a photograph of a 12% reducing PAGE of expression of CTB oligomers. 

Figs. 5A and B show photographs of reducing gels of expression and assembly of disulfide 
bonded Guy's 13 monoclonal antibody.

Fig. 5C is a photograph of a non-reducing gel.

Fig. 6 is a photograph of a Western Blot of expression of HSA via nuclear genome in potato.

Fig. 7 is a pair of frequency histograms showing the percentage of Kennebec and Desiree transgenic
plants expressing different HSA levels.

Fig. 8 is a photograph of a Western Blot of expression of HSA by chloroplast vectors in E. coli.

Fig. 9 is a photograph of a Western Blot of expression of HSA via chloroplast genome in 
tobacco. 

Fig. 10A is a map of the pLD chloroplast transformation vector and primer landing sites.

Fig. 10B is a photograph of an agarose gel containing PCR products using total plant DNA
as template from transformed plants.

DETAILED DESCRIPTION 
Expression of HSA via nuclear genome in potato: Recently, our collaborators in Spain cloned the
HSA cDNA from human liver cells and fused the patatin promoter (whose expression is tuber
specific (60)) along with the leader sequence of PIN II (proteinase II inhibitor potato transit peptide
that directs HSA to the apoplast (61)). Leaf discs of Desiree and Kennebec potato plants were
transformed using Agrobacterium tumefaciens. A total of 98 transgenic Desiree clones and 30
Kennebec clones were tested by PCR and western blots. Western blots showed that the recombinant
albumin (rHSA) had been properly cleaved from the proteinase II inhibitor transit peptide, as shown in Fig. 6.
Expression levels varied widely among the transgenic clones of both cultivars, as expected and as
shown in Fig. 7, probably because of position effects and gene silencing (62,63). The population
distribution was similar in both cultivars: the majority of transgenic clones showed expression levels 
between 0.04 and 0.06% of rHSA in the total soluble protein. The maximum recombinant HSA
amount expressed was 0.2%. Between one and five T-DNA insertions per tetraploid genome were
observed in these clones. Plants with higher protein expression were always clones with several
copies of the HSA gene. Levels of mRNA were analyzed by Northern blots. There was a correlation
between transcript levels and recombinant albumin accumulation in transgenic tubers. The N-terminal
sequence showed proper cleavage of the transit peptide, and the amino terminal sequences of
recombinant and human HSA were identical. Inhibition of patatin expression using the antisense
technology did not improve the amount of rHSA. Average expression level among 29 antisense
transgenic plants was 0.032% of total soluble protein, with a maximum expression of 0.1%. The
maximum HSA expression level observed was 5-10 times more than that reported by Sijmons et al.
(12). However, higher levels are needed to make plant derived HSA production commercially
feasible.

Chloroplast expression of HSA: We have also initiated transformation of the tobacco chloroplast
genome for hyperexpression of HSA; chloroplast transformation is a new technology that has yielded the highest
expression levels reported in plants (29). The HSA codon composition is advantageous for chloroplast
expression and no changes in the nucleotide sequence were needed. The pLD vector was used for all the
constructs. We designed several vectors to optimize HSA expression. All these contain ATG as the
first codon of the mature protein coding sequence.

1- RBS-ATG-HSA : The first vector includes the gene that codes for the mature HSA plus an 
additional ATG as a translation initiation codon. We included the ATG in one of the primers of the
PCR, 5 nucleotides downstream of the chloroplast preferred RBS sequence GGAGG. The cDNA
sequence of the mature HSA was used as template. The PCR product was cloned into PCR 2.1 
vector, excised as an EcoRI-NotI fragment and introduced into the pLD vector. 
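
The layout of the forward primer described for this first vector (RBS, a 5-nucleotide gap, then the added ATG followed by the start of the mature HSA coding sequence) can be sketched as a simple string assembly. This is a minimal illustration under stated assumptions, not the authors' actual primer: the spacer is shown as unspecified bases (`N`), the annealing length of 18 nt is arbitrary, the example coding sequence is a hypothetical placeholder, and any restriction site added for cloning is omitted.

```python
RBS = "GGAGG"        # chloroplast-preferred ribosome binding site (from the text)
SPACER_LENGTH = 5    # nucleotides between the RBS and the ATG (from the text)
START_CODON = "ATG"  # added translation initiation codon (from the text)


def build_forward_primer(mature_cds_5prime: str,
                         spacer: str = "N" * SPACER_LENGTH,
                         anneal_length: int = 18) -> str:
    """Assemble RBS + spacer + ATG + the first bases of the mature coding sequence.

    The spacer bases and anneal_length are illustrative assumptions; the text
    only specifies the RBS sequence and the 5-nucleotide distance to the ATG.
    """
    cds = mature_cds_5prime.upper()
    spacer = spacer.upper()
    if len(spacer) != SPACER_LENGTH or not set(spacer) <= set("ACGTN"):
        raise ValueError("spacer must be 5 nucleotides (A, C, G, T or N)")
    if not set(cds) <= set("ACGT"):
        raise ValueError("coding sequence must contain only A, C, G, T")
    if len(cds) < anneal_length:
        raise ValueError("coding sequence shorter than requested annealing region")
    return RBS + spacer + START_CODON + cds[:anneal_length]


if __name__ == "__main__":
    # Hypothetical placeholder for the 5' end of a mature coding sequence;
    # it has not been checked against the cDNA used in the text.
    placeholder_cds = "GATGCACACAAGAGTGAG"
    print(build_forward_primer(placeholder_cds))
```

Keeping the spacer as an explicit parameter makes it easy to see that the only fixed elements taken from the description are the GGAGG motif, the 5-nucleotide distance and the ATG itself.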

2- 5'UTRpsbA-ATG-HSA : The 200 bp tobacco chloroplast DNA fragment containing the 5' psbA
UTR (untranslated region) was amplified using PCR and tobacco DNA as template. The fragment
was cloned into PCR 2.1 vector, excised as an EcoRI-NcoI fragment, inserted at the NcoI site of the
ATG-HSA and finally inserted into the pLD vector as an EcoRI-NotI fragment downstream of the
16S rRNA promoter to enhance translation of the protein.

3- BtORF1+2-ATG-HSA : ORF1 and ORF2 of the Bt Cry2Aa2 operon were amplified in a PCR
using the complete operon as a template. The fragment was cloned into PCR 2.1 vector, excised as
an EcoRI-EcoRV fragment, inserted at the EcoRV site with the ATG-HSA sequence and introduced into
the pLD vector as an EcoRI-NotI fragment. The ORF1 and ORF2 were fused upstream of the
ATG-HSA.

4- BtORF1+2-5'UTRpsbA-ATG-HSA : The 5'UTRpsbA was introduced in the vector number 3
upstream of the HSA in the EcoRV-NcoI site.

Expression of chloroplast vectors was first tested in E. coli before their use in tobacco
transformation because of the similarity of protein synthetic machinery (64). Different levels of
expression were obtained in E. coli depending on the construct as shown in Fig. 8. Using the psbA
5' UTR and the ORF1 and ORF2 of the cry2Aa2 operon, we obtained higher levels of expression than
using only the RBS. We observed in previous experiments that HSA in E. coli is completely
insoluble, probably due to an improper folding resulting from the absence of disulfide bonds. This
is the reason why the protein is precipitated in the gel as shown in Fig. 8. Different polypeptide sizes
were observed, probably due to incomplete translation. Assuming that E. coli and chloroplast have
similar protein synthesis machinery, one could expect different levels of expression in transgenic
tobacco chloroplasts depending on the regulatory sequences, with the advantage that disulfide bonds
are formed in chloroplasts (31). These four vectors were introduced into tobacco leaves via particle
bombardment (65) and after 4 weeks small shoots appeared as a result of independent transformation
events. They all were tested by PCR to check integration in the chloroplast genome as shown in Figs.
10A and B. The positive clones were transferred to pots. Transgenic leaves analyzed by western
blots showed different levels of expression depending on the 5' region used in the transformation
vector. Maximum levels were observed in the plants transformed with the HSA preceded by the 5'
UTR of the psbA gene as shown in Fig. 9. Quantification of the HSA and molecular analysis of these
transformants are in progress.

1) Evaluation of chloroplast gene expression: A systematic approach to identify and
overcome potential limitations of foreign gene expression in chloroplasts of transgenic plants is
essential. Information gained in this study should increase the utility of the chloroplast transformation
system for scientists interested in expressing other foreign proteins. Therefore, it is important to
systematically analyze transcription, RNA abundance, RNA stability, rate of protein synthesis and
degradation, proper folding and activity. For example, the rate of transcription of the introduced
HSA gene will be compared with the highly expressing endogenous chloroplast genes (rbcL, psbA,
16S rRNA), using run-on transcription assays to determine if the 16S rRNA promoter is operating as
expected. Transgenic chloroplasts containing each of the constructs with different 5' regions (see
preliminary studies) will be investigated to test their transcription efficiency. Similarly, transgene
RNA levels will be monitored by northerns, dot blots and primer extension relative to endogenous
rbcL, 16S rRNA or psbA. These results, along with run-on transcription assays, should provide
valuable information on RNA stability, processing, etc. With our past experience in expression of
several foreign genes, RNA appears to be extremely stable based on northern blot analysis. However,
a systematic study would be valuable to advance utility of this system by other scientists. Most
importantly, the efficiency of translation will be tested in isolated chloroplasts and compared with the
highly translated chloroplast protein (psbA). Pulse chase experiments will help assess if translational
pausing or premature termination occurs. Evaluation of percent RNA loaded on polysomes or in
constructs with or without 5' UTRs will help determine the efficiency of the ribosome binding site and
5' stem-loop translational enhancers. In our recent experience, we observed a 200-fold difference in
accumulation of foreign proteins due to decreases in proteolysis conferred by a putative chaperonin
(29). Therefore, proteins from constructs expressing or not expressing the putative chaperonin (with
or without ORF1+2) should provide valuable information on protein stability. Thus, this information
will be used to improve the next generation of chloroplast vectors.

2) Expression of the mature protein: HSA is a pre-protein that is cleaved at the N-terminus
to secrete the mature protein. The codon for translation initiation is in the presequence. In
chloroplasts, the necessity of expressing the mature protein introduces an additional amino acid
(methionine) in the coding sequence. The sequence of the mature protein is first subcloned beginning with an ATG
to optimize expression levels. Subsequent immunological assays in mice are performed with the
protein to investigate if the extra methionine can cause an immunogenic response or low bioactivity.
Alternatively, different systems can produce the mature protein. These systems can include the
synthesis of a protein fused to a peptide that is cleaved intracellularly (processed) by chloroplast
enzymes or the use of chemical or enzymatic cleavage after partial purification of proteins from plant
cells.

Use of peptides that are cleaved in chloroplast: Staub et al. (31) reported chloroplast expression
of human somatotropin similar to the native human protein by using ubiquitin fusions that were
cleaved in the stroma by a ubiquitin protease. However, the processing efficiency ranged from 30 to
80% and the cleavage site was not accurate. To process chloroplast-expressed proteins, a peptide
which is cleaved in the stroma is essential. The transit peptide sequence of the RuBisCo (ribulose
1,5-bisphosphate carboxylase) small subunit is an advantageous choice. This transit peptide has been
studied in depth (66). RuBisCo is one of the proteins that is synthesized in cytoplasm and transported
posttranslationally into the chloroplast in an energy dependent process. The transit peptide is
proteolytically removed upon transport into the stroma by the stromal processing peptidase (67). There
are several sequences described for different species (68). A transit peptide consensus sequence for
the RuBisCo small subunit of vascular plants is published by Keegstra et al. (69). The amino acids
that are proximal to the C-terminal (41-59) are highly conserved in the higher plant transit sequences
and belong to the domain which is involved in enzymatic cleavage (66). The RuBisCo small subunit
transit peptide has been fused with various marker proteins (69,70), even with animal proteins
(71,72), to target proteins to the chloroplast.

Prior to transformation studies, cleavage efficiency and accuracy are tested by in vitro
translation of the fusion protein and in organello import studies using intact chloroplasts. Once the
correct fusion sequence for producing the mature protein is known, such sequence encoding the
amino terminal portion of tobacco chloroplast transit peptide is linked with the mature sequence of
the protein. Codon composition of the tobacco RuBisCo small subunit transit peptide appears to be
compatible with chloroplast optimal translation (see section 3 and Table on page 13). Additional
transit peptide sequences for targeting and cleavage in the chloroplast have been described (66). In
cases where the RuBisCo small subunit transit peptide is not suitable, other transit peptides with
cleavage in the stroma can be used. The lumen of thylakoids is a good target because thylakoids are easy
to purify. It is relatively easy to free lumenal proteins either by sonication or with a very low Triton
X-100 concentration. However, this often requires insertion of additional amino acid sequences for
efficient import (66).

Use of chemical or enzymatic cleavage: The strategy of fusing a protein to a tag with affinity for
a certain ligand has been used to enable affinity purification of recombinant products (73-75).
However, scale up of this technology is usually quite expensive. A vast number of cleavage methods,
both chemical and enzymatic, have been investigated for this purpose (75). Chemical cleavage
methods have low specificity and the relatively harsh cleavage conditions can result in chemical
modifications of the released products (75). Some of the enzymatic methods offer significantly higher
cleavage specificities together with high efficiency, e.g. H64A subtilisin, IgA protease and factor Xa
(74,75), but these enzymes have the drawback of being quite expensive.

Trypsin, which cleaves C-terminal of basic amino acid residues, has been used for a long time
to cleave fusion proteins (3,76). Despite expected low specificity, trypsin has been shown to be
useful for specific cleavage of fusion proteins, leaving basic residues within folded protein domains
uncleaved (76). The use of trypsin only requires that the N-terminus of the mature protein be
accessible to the protease and that the potential internal sites are protected in the native conformation.
Trypsin has the additional advantage of being inexpensive and readily available. In the case of HSA,
when it was expressed in E. coli with 6 additional codons coding for a trypsin cleavage site, HSA was
processed successfully into the mature protein after treatment with the protease. In addition, the
N-terminal sequence was found to be unique and identical to the sequence of natural HSA, the
conversion was complete and no degradation products were observed (3). This in vitro maturation
is selective because correctly folded albumin is highly resistant to trypsin cleavage at inner sites (3).
This system could be tested for chloroplast HSA vectors using protein expressed in E. coli.
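
The sequence rule stated above (cleavage C-terminal of basic residues) can be sketched as a short Python helper that lists the potential tryptic sites in a fusion protein. This is only an illustration: the leader and the toy sequence below are hypothetical, not the construct described in reference (3), and the helper ignores folding, which the text identifies as the factor that protects internal sites in the native protein.

```python
def tryptic_sites(seq):
    """Return 1-based positions of K or R residues after which trypsin could cut.

    Only the simple rule from the text is applied (cut C-terminal of a basic
    residue); protection of sites by the folded structure is not modelled.
    """
    seq = seq.upper()
    return [i + 1 for i, aa in enumerate(seq) if aa in "KR" and i + 1 < len(seq)]


def fragments(seq, sites):
    """Split seq after each listed position, as in a complete digest."""
    cuts = [0] + list(sites) + [len(seq)]
    return [seq[a:b] for a, b in zip(cuts, cuts[1:])]


# Hypothetical six-residue leader ending in a basic residue, followed by a
# toy stand-in for the start of a mature protein.
toy_fusion = "MAGSKR" + "DAHKSE"
sites = tryptic_sites(toy_fusion)
print(sites)
print(fragments(toy_fusion, sites))
```

In a real fusion only the site at the end of the leader is expected to be accessible, so a list like this gives an upper bound on cleavage points rather than a prediction of the digest.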

Staub et al. (31) demonstrated that the chloroplast methionine aminopeptidase is active and
they found 95% removal of the first methionine of an ATG-somatotropin protein that was
expressed via the chloroplast genome. There are several investigations that have shown a very strict
pattern of cleavage by this peptidase (77). Methionine is only removed when the second residue is
glycine, alanine, serine, cysteine, threonine, proline or valine, but if the third amino acid is proline the
cleavage is inhibited. For HSA the second amino acid is aspartic acid, so the cleavage may not be
possible.
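
The cleavage pattern just described can be written down as a small rule check. The Python sketch below encodes only the rule as stated in the text (second residue in the permitted set, third residue not proline); the function name and the test sequences are illustrative assumptions, and real processing efficiency is not modelled.

```python
# Second residues that permit removal of the initiator methionine,
# as listed in the text: Gly, Ala, Ser, Cys, Thr, Pro, Val.
PERMISSIVE_SECOND = set("GASCTPV")


def initiator_met_removed(seq):
    """Apply the stated methionine aminopeptidase rule to a protein sequence.

    Returns True when the rule predicts removal of the first methionine.
    """
    seq = seq.upper()
    if len(seq) < 2 or seq[0] != "M":
        return False
    if len(seq) >= 3 and seq[2] == "P":
        return False  # a proline in third position inhibits cleavage
    return seq[1] in PERMISSIVE_SECOND


# ATG-HSA has aspartic acid (D) in second position, so removal is not predicted.
print(initiator_met_removed("MDAHK"))
# Hypothetical sequences illustrating the other two cases.
print(initiator_met_removed("MASKT"))
print(initiator_met_removed("MAPKT"))
```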
3) Optimization of gene expression: We have reported that foreign genes are expressed
between 3% (cry2Aa2) and 46% (cry2Aa2 operon) in transgenic chloroplasts (26,29). Based on the
outcome of the evaluation of HSA chloroplast transgenic plants, several approaches will be used to
enhance translation of the recombinant proteins. In chloroplasts, transcriptional regulation of gene
expression is less important, although some modulations by light and developmental conditions are
observed (78). RNA stability appears to be among the least of the problems, because excessive
accumulation of foreign transcripts has been observed, at times 16,966-fold higher than in the highly expressing
nuclear transgenic plants (79). Chloroplast gene expression is regulated to a large extent at the
post-transcriptional level. For example, 5' UTRs are necessary for optimal translation of chloroplast
mRNAs. Shine-Dalgarno (GGAGG) sequences as well as a stem-loop structure located 5' adjacent
to the SD sequence are required for efficient translation. A recent study has shown that insertion of
the psbA 5' UTR downstream of the 16S rRNA promoter enhanced translation of a foreign gene
(GUS) hundred-fold (80). Therefore, the 200-bp tobacco chloroplast DNA fragment (1680-1480)
containing the 5' psbA UTR is used. This PCR product is then inserted downstream of the 16S rRNA
promoter to enhance translation of the recombinant proteins.

Yet another approach for enhancement of translation can be the codon composition
optimization. It is reasonable to expect efficient expression in chloroplasts since the protein is
translated in E. coli. Although rbcL (RuBisCO) is the most abundant protein on earth, it is not
translated as highly as the psbA gene due to the extremely high turnover of the psbA gene product.
The psbA gene is under stronger selection for increased translation efficiency and is the most
abundant thylakoid protein. In addition, the codon usage in higher plant chloroplasts is biased
towards the NNC codon of 2-fold degenerate groups (i.e. TTC over TTT, GAC over GAT, CAC
over CAT, AAC over AAT, ATC over ATT, ATA etc.). This is in addition to a strong bias towards
T at third position of 4-fold degenerate groups. There is also a context effect that should be taken
into consideration while modifying specific codons. The 2-fold degenerate sites immediately
upstream from a GNN codon do not show this bias towards NNC (TTT GGA is preferred to TTC
GGA while TTC CGT is preferred to TTT CGT, TTC AGT to TTT AGT and TTC TCT to TTT
TCT) (81,82).
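
The NNC bias and its context effect can be checked on a candidate coding sequence with a short tally. The Python sketch below is a minimal illustration under stated assumptions: it covers only the four C/T-ending pairs named in the text (TTC/TTT, GAC/GAT, CAC/CAT, AAC/AAT), it reads the sequence in frame from its first base, and the function names and the example sequence are hypothetical.

```python
# First two bases of the 2-fold degenerate C/T-ending pairs named in the text.
TWO_FOLD_PREFIXES = {"TT", "GA", "CA", "AA"}


def codons(cds):
    """Split an in-frame coding sequence into codons, dropping any partial codon."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    return [cds[i:i + 3] for i in range(0, usable, 3)]


def nnc_bias(cds):
    """Count C-ending versus T-ending codons at the 2-fold sites.

    Sites immediately upstream of a GNN codon are tallied separately,
    because the text reports that the NNC preference does not hold there.
    """
    counts = {"before_GNN": {"C": 0, "T": 0}, "other": {"C": 0, "T": 0}}
    cs = codons(cds)
    for i, codon in enumerate(cs):
        if codon[:2] in TWO_FOLD_PREFIXES and codon[2] in "CT":
            next_codon = cs[i + 1] if i + 1 < len(cs) else ""
            context = "before_GNN" if next_codon.startswith("G") else "other"
            counts[context][codon[2]] += 1
    return counts


# Example built from the codon pairs quoted in the text.
print(nnc_bias("TTTGGATTCCGTTTCAGT"))
```

A sequence that follows the described pattern should show mostly C in the "other" tally and no such preference in the "before_GNN" tally.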

In addition, highly expressed chloroplast genes use GNN more frequently than other genes.
The codon usage database at http://www.kazusa.or.jp/codon was used to analyze codon
composition by comparing different species. Abundance of amino acids in chloroplasts and tRNA
anticodons present in chloroplasts were taken into consideration. We also compared the A+T% content
of all foreign genes that had been expressed in transgenic chloroplasts in our laboratory with the
percentage of chloroplast expression. We found that higher levels of A+T always correlated with
high expression levels (see Table 1). The HSA sequence showed 57% A+T content and 40% of
the total codons matched the most translated psbA codons. According to the data of the Table
and taking into consideration all these factors, good chloroplast expression of the HSA gene without
modifications in its codon composition can be expected.
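
The two figures quoted for HSA (A+T content and the share of codons matching preferred codons) are simple to compute for any candidate gene. The Python sketch below shows one way to do so. The preferred-codon set is passed in as a parameter because the psbA codon table is not reproduced in this section; the set and the sequence used in the example are placeholders, not data from the Table.

```python
def at_content(seq):
    """Percentage of A and T bases in a DNA sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return 100.0 * sum(seq.count(base) for base in "AT") / len(seq)


def preferred_codon_fraction(cds, preferred):
    """Percentage of in-frame codons that belong to the preferred set."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    codon_list = [cds[i:i + 3] for i in range(0, usable, 3)]
    if not codon_list:
        raise ValueError("no complete codon")
    hits = sum(1 for codon in codon_list if codon in preferred)
    return 100.0 * hits / len(codon_list)


# Placeholder inputs for illustration only.
example_cds = "ATGTTCGACTTTAAC"
example_preferred = {"TTC", "GAC", "AAC"}
print(round(at_content(example_cds), 1))
print(round(preferred_codon_fraction(example_cds, example_preferred), 1))
```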
4) Vector constructions: pLD vector is used for the constructs. This vector was developed
for chloroplast transformation. It contains the 16S rRNA promoter (Prrn) driving the selectable
marker gene aadA (aminoglycoside adenyl transferase conferring resistance to spectinomycin)
followed by the psbA 3' region (the terminator from a gene coding for photosystem II reaction center
components) from the tobacco chloroplast genome. The pLD vector is a universal chloroplast
expression/integration vector and can be used to transform chloroplast genomes of several other
plant species (14,27) because its flanking sequences are highly conserved among higher plants.
The universal vector uses trnA and trnI genes (chloroplast transfer RNAs coding for alanine and
isoleucine) from the inverted repeat region of the tobacco chloroplast genome as flanking sequences
for homologous recombination. Because the universal vector integrates foreign genes within the
Inverted Repeat region of the chloroplast genome, it doubles the copy number of the transgene (from
5000 to 10,000 copies per cell in tobacco). Furthermore, it has been demonstrated that homoplasmy
is achieved even in the first round of selection in tobacco, probably because of the presence of a
chloroplast origin of replication within the flanking sequence in the universal vector (thereby
providing more templates for integration). Because of these and several other reasons, foreign gene
expression was shown to be much higher when the universal vector was used instead of the tobacco
specific vector (30).

The following vectors can be used to optimize protein expression, purification and production
of HSA with the same amino acid composition as the human protein.
a) We increase translation using the psbA 5' UTR to optimize expression. The 200 bp tobacco
chloroplast DNA fragment containing 5' psbA is amplified by PCR using tobacco chloroplast
DNA as template. This fragment is cloned directly in the pLD vector multiple cloning site
(EcoRI-NcoI) downstream of the promoter and the aadA gene. The cloned sequence is the
same as in the psbA gene.

b) For enhancing protein stability and facilitating purification, the cry2Aa2 Bacillus thuringiensis
operon derived putative chaperonin is used. Expression of the cry2Aa2 operon in
chloroplasts provides a model system for hyper-expression of foreign proteins (46% of total
soluble protein) in a folded configuration, enhancing their stability and facilitating purification
(29). This justifies inclusion of the putative chaperonin from the cry2Aa2 operon in one of
the newly designed constructs. In this region there are two open reading frames (ORF1 and
ORF2) and a ribosomal binding site (rbs). This sequence contains elements necessary for
Cry2Aa2 crystallization, which help to fold or crystallize the HSA protein, helping in the
subsequent purification. Successful crystallization of other proteins using this putative
chaperonin has been demonstrated (35). We amplify ORF1 and ORF2 of the Bt cry2Aa2
operon by PCR using the complete operon as template. The fragment is cloned into a pCR
2.1 vector and excised as an EcoRI-EcoRV product. This fragment is then cloned directly
into the pLD vector multiple cloning site (EcoRI-EcoRV) downstream of the promoter and
the aadA gene.

c) To obtain HSA with the same amino acid composition as the mature human protein (without
the extra methionine), we first fuse HSA with the RuBisCo small subunit transit peptide.
Also, other constructions are performed to allow cleavage of the protein after isolation from
chloroplasts.

The first set of constructs includes the sequence of HSA beginning with an ATG, introduced
by PCR using primers. Once optimal expression levels are achieved, and if the ATG is shown
to be a problem (determined by immunological assays in mice), processing to produce the mature
protein is addressed. The first attempt is the use of the RuBisCo small subunit transit peptide. This
transit peptide is amplified by PCR using tobacco DNA as a template and cloned into the pCR 2.1
vector. The HSA gene is fused with the transit peptide using a MluI restriction site that is introduced
in the PCR primers for amplification of the transit peptide and the HSA coding sequence. The gene
fusion is then inserted into the pLD vector downstream of the 5' region that gives optimal expression
of HSA (RBS, 5'UTRpsbA, ORF1+2, ORF1+2-5'UTRpsbA). Another approach to eliminate the
ATG of the coding region is to place the ATG before a protease recognition sequence, such as that of trypsin,
and to remove the extra sequence in vitro to obtain the mature protein. Such sequences will be
introduced by primers in a PCR. After completing vector constructions, the vectors are sequenced
to confirm correct nucleotide sequence and in-frame fusion. DNA sequencing is performed using a
Perkin Elmer ABI Prism 373 DNA sequencing system or the like.

Because of the similarity of protein synthetic machinery (64), expression of chloroplast
vectors is first tested in E. coli before their use in tobacco transformation. For Escherichia coli
expression the XL-1 Blue strain is used. Purification and cleavage assays are performed using E. coli
expressed protein.

5) Bombardment, Regeneration and Characterization of Chloroplast Transgenic Plants:

Tobacco (Nicotiana tabacum var. Petit Havana) plants are grown aseptically by germination of seeds
on MSO medium. This medium contains MS salts (4.3 g/liter), B5 vitamin mixture (myo-inositol,
100 mg/liter; thiamine-HCl, 10 mg/liter; nicotinic acid, 1 mg/liter; pyridoxine-HCl, 1 mg/liter),
sucrose (30 g/liter) and phytagar (6 g/liter) at pH 5.8. Fully expanded, dark green leaves of about
two month old plants are used for bombardment.

Leaves are placed abaxial side up on a Whatman No. 1 filter paper lying on the RMOP
medium (20) in standard petri plates (100x15 mm) for bombardment. Gold (0.6 µm) microprojectiles
are coated with plasmid DNA (chloroplast vectors) and bombardments are carried out with the
biolistic device PDS1000/He (Bio-Rad) as described by Daniell (65). Following bombardment, petri
plates are sealed with parafilm and incubated at 24°C under a 12 h photoperiod. Two days after
bombardment, leaves are chopped into small pieces of ~5 mm² in size and placed on the selection
medium (RMOP containing 500 µg/ml of spectinomycin dihydrochloride) with abaxial side touching
the medium in deep (100x25 mm) petri plates (~10 pieces per plate). The regenerated spectinomycin
resistant shoots are chopped into small pieces (~2 mm²) and subcloned into fresh deep petri plates (~5
pieces per plate) containing the same selection medium. Resistant shoots from the second culture
cycle are transferred to the rooting medium (MSO medium and spectinomycin dihydrochloride, 500
mg/liter). Rooted plants are transferred to soil and grown at 26°C under 16 hour photoperiod
conditions for further analysis.
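
For bench preparation, the per-liter MSO recipe given above can be scaled to any batch volume. The Python sketch below is a convenience illustration only: the quantities are those stated in the text, while the function name and the option for adding spectinomycin (as in the rooting medium) are assumptions made for the example. The RMOP medium is not covered because its composition is not given in this section.

```python
# Per-liter MSO composition from the text, in milligrams.
MSO_MG_PER_LITER = {
    "MS salts": 4300,
    "myo-inositol": 100,
    "thiamine-HCl": 10,
    "nicotinic acid": 1,
    "pyridoxine-HCl": 1,
    "sucrose": 30000,
    "phytagar": 6000,
}
# Spectinomycin dihydrochloride level used in the rooting medium (mg/liter).
SPECTINOMYCIN_MG_PER_LITER = 500


def scale_mso(volume_liters, with_spectinomycin=False):
    """Return the mass of each component (mg) for the requested volume."""
    if volume_liters <= 0:
        raise ValueError("volume must be positive")
    recipe = {name: mg * volume_liters for name, mg in MSO_MG_PER_LITER.items()}
    if with_spectinomycin:
        recipe["spectinomycin dihydrochloride"] = (
            SPECTINOMYCIN_MG_PER_LITER * volume_liters
        )
    return recipe


for name, mg in scale_mso(0.5, with_spectinomycin=True).items():
    print(f"{name}: {mg} mg")
```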

PCR analysis of putative transformants: PCR is performed using DNA isolated from control and
transgenic plants to distinguish a) true chloroplast transformants from mutants and b) chloroplast
transformants from nuclear transformants. Primers for testing the presence of the aadA gene (that
confers spectinomycin resistance) in transgenic plants are designed to land on the aadA coding sequence and
16S rRNA gene. One primer lands on the aadA gene while another lands on the native chloroplast
genome as shown in Fig. 10A to test chloroplast integration of the genes. No PCR product is
obtained with nuclear transgenic plants using this set of primers. The primer set is used to test
integration of the entire gene cassette without any internal deletion or looping out during homologous
recombination. A similar strategy has been used successfully by us to confirm chloroplast integration
of foreign genes (26-30). This screening is essential to eliminate mutants and nuclear transformants.
Total DNA from unbombarded and transgenic plants is isolated as described by Edwards et al. (83)
to conduct PCR analyses in transgenic plants. Chloroplast transgenic plants containing the desired
gene are moved to a second round of selection to achieve homoplasmy.

Southern Analysis for homoplasmy and copy number: Southern blots are performed to determine
the copy number of the introduced foreign gene per cell as well as to test homoplasmy. There are
several thousand copies of the chloroplast genome present in each plant cell. Therefore, when foreign
genes are inserted into the chloroplast genome, it is possible that some of the chloroplast genomes
have foreign genes integrated while others remain as the wild type (heteroplasmy). Therefore, to
ensure that only the transformed genome exists in cells of transgenic plants (homoplasmy), the
selection process is continued. Total DNA from transgenic plants is probed with the chloroplast
border (flanking) sequences (the trnI-trnA fragment) to confirm that the wild type genome does not
exist at the end of the selection cycle. If wild type genomes are present (heteroplasmy), the native
fragment size is observed along with transformed genomes. Presence of a large fragment (due to
insertion of foreign genes within the flanking sequences) and absence of the native small fragment
confirms homoplasmy (26,27,30).
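
The interpretation of the Southern blot described above reduces to a decision on which bands hybridize with the flanking-sequence probe. The Python sketch below expresses that decision. It assumes, for illustration, that the transformed fragment is the native fragment plus the inserted cassette (that is, the restriction enzyme does not cut within the insert); the fragment sizes, the tolerance and the function name are hypothetical.

```python
def classify_southern(bands_kb, native_kb, insert_kb, tol_kb=0.2):
    """Classify a lane from the band sizes detected with the flanking probe.

    bands_kb  : observed fragment sizes in kb
    native_kb : expected size of the wild type fragment
    insert_kb : size of the integrated cassette (assumed uncut by the enzyme)
    tol_kb    : sizing tolerance used when matching bands
    """
    transformed_kb = native_kb + insert_kb
    has_native = any(abs(b - native_kb) <= tol_kb for b in bands_kb)
    has_transformed = any(abs(b - transformed_kb) <= tol_kb for b in bands_kb)
    if has_transformed and not has_native:
        return "homoplasmic"
    if has_transformed and has_native:
        return "heteroplasmic"
    if has_native:
        return "untransformed"
    return "inconclusive"


# Hypothetical sizes: 4.5 kb native fragment, 2.0 kb cassette.
print(classify_southern([6.5], native_kb=4.5, insert_kb=2.0))
print(classify_southern([4.5, 6.5], native_kb=4.5, insert_kb=2.0))
```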

The copy number of the integrated gene is determined by establishing homoplasmy for the
transgenic chloroplast genome. Tobacco chloroplasts contain 5000-10,000 copies of their genome
per cell (27). When only a fraction of the genomes are actually transformed, the copy number, by
default, must be less than 10,000. By establishing that in the transgenics the transformed genome
carrying the inserted gene is the only one present, it can be established that the copy number is about
5000-10,000 per cell. This is usually done by digesting the total DNA with a suitable restriction
enzyme and probing with the flanking sequences that enable homologous recombination into the
chloroplast genome. The native fragment present in the control should be absent in the transgenics.
The absence of the native fragment proves that only the transgenic chloroplast genome is present in the
cell and that there is no native, untransformed chloroplast genome without the foreign gene present.
This establishes the homoplasmic nature of our transformants, simultaneously providing us with an
estimate of about 5000-10,000 copies of the foreign genes per cell.

Northern Analysis for transcript stability: Northern blots are performed to test the efficiency of
transcription of the genes. Total RNA is isolated from 150 mg of frozen leaves by using the "RNeasy
Plant Total RNA Isolation Kit" (Qiagen Inc., Chatsworth, CA). RNA (10-40 µg) is denatured by
formaldehyde treatment, separated on a 1.2% agarose gel in the presence of formaldehyde and
transferred to a nitrocellulose membrane (MSI) as described in Sambrook et al. (84). Probe DNA
(HSA gene coding region) is labeled by the random-primed method (Promega) with 32P-dCTP
isotope. The blot is pre-hybridized, hybridized and washed as described above for Southern blot
analysis. Transcript levels are quantified by the Molecular Analyst Program using the GS-700
Imaging Densitometer (Bio-Rad, Hercules, CA) or the like.

Expression and quantification of the total protein expressed in chloroplast: Chloroplast
expression assays are performed by Western blot. Recombinant protein levels in transgenic plants
of first and second generation (T0 and T1) are determined using quantitative ELISA assays. A
standard curve is generated using known concentrations and serial dilutions of recombinant and native
proteins. Different tissues are analyzed using young, mature and old leaves against goat anti-HSA
(Nordic Immunology) antibodies. Bound IgG is measured using horseradish peroxidase-labelled
anti-goat IgG (Sigma).
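
Quantification from the standard curve can be illustrated with a simple calculation. The Python sketch below fits a straight line to the standards and reads unknown samples off that line. It assumes that the readings fall in the linear range of the assay (ELISA curves are often fitted with non-linear models instead), and all concentrations, absorbances and names in the example are made up for illustration.

```python
def fit_line(concentrations, absorbances):
    """Ordinary least-squares fit of absorbance = slope * concentration + intercept."""
    n = len(concentrations)
    if n < 2 or n != len(absorbances):
        raise ValueError("need at least two paired standards")
    mean_x = sum(concentrations) / n
    mean_y = sum(absorbances) / n
    sxx = sum((x - mean_x) ** 2 for x in concentrations)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(concentrations, absorbances))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def sample_concentration(absorbance, slope, intercept, dilution_factor=1.0):
    """Invert the standard curve and correct for the sample dilution."""
    return (absorbance - intercept) / slope * dilution_factor


# Made-up standards (ng/ml) and readings lying on a straight line.
standards = [0, 10, 20, 40]
readings = [0.05, 0.25, 0.45, 0.85]
slope, intercept = fit_line(standards, readings)
print(round(sample_concentration(0.45, slope, intercept, dilution_factor=10), 1))
```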

Inheritance of Introduced Foreign Genes: While it is unlikely that introduced DNA moves from
the chloroplast genome to the nuclear genome, it is possible that the gene can be integrated in the nuclear
genome during bombardment and remain undetected in Southern analysis. Therefore, of the initial
tobacco transformants, some are allowed to self-pollinate, whereas others are used in reciprocal crosses
with control tobacco (transgenics as female acceptors and pollen donors; testing for maternal
inheritance). Harvested seeds (T1) are germinated on media containing spectinomycin. Achievement
of homoplasmy and mode of inheritance can be classified by looking at germination results.
Homoplasmy can be indicated by totally green seedlings (27) while heteroplasmy is displayed by
variegated leaves (lack of pigmentation, 24). Lack of variation in chlorophyll pigmentation among
progeny also underscores the absence of position effect, an artifact of nuclear transformation.
Maternal inheritance is demonstrated by sole transmission of introduced genes via seed generated on
transgenic plants, regardless of pollen source (green seedlings on selective media). When transgenic
pollen is used for pollination of control plants, resultant progeny do not show resistance to the chemical
in selective media (will appear bleached; 24). Molecular analyses confirm transmission and
expression of introduced genes, and T2 seed is generated from those confirmed plants by the
analyses described above.

6) Purification methods: The standard method of purification employs classical biochemical
techniques with the crystallized proteins inside the chloroplast. In this case, the homogenates are
passed through miracloth to remove cell debris. Centrifugation at 10,000 x g pellets all foreign
proteins (29). Proteins are solubilized using pH, temperature gradient, etc. This is possible if
ORF1 and 2 of the cry2Aa2 operon can fold and crystallize the recombinant protein. When there is
no crystal formation, other purification methods are applied (classical biochemistry techniques).
Albumin is typically administered in tens of gram quantities. At a purity level of 99.999% (a level
considered sufficient for other recombinant protein preparations), recombinant HSA (rHSA)
impurities on the order of one mg are still injected into patients. Hence, impurities from the host
organism must be reduced to a minimum. Furthermore, purified rHSA must be identical to human
HSA. Despite these stringent requirements, purification costs must be kept low. It is not appropriate
to apply conventional processes for purifying HSA originating in plasma to purify the HSA obtained
by gene manipulation. This is because the impurities to be eliminated from rHSA differ from those
contained in the HSA originating in plasma. Namely, rHSA is contaminated with, for example,
coloring matters characteristic of recombinant HSA, proteins originating in the host cells,
polysaccharides, etc. In particular, it is necessary to sufficiently eliminate components originating in
the host cells, since they are foreign matter for living organisms including humans and can cause the
problem of antigenicity.
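
The impurity figure quoted above follows from simple arithmetic: the impurity mass is the dose multiplied by the impure fraction. The Python sketch below works this through for two doses chosen for illustration (the text states only "tens of gram quantities"); the function name is an assumption.

```python
def impurity_mg(dose_g, purity_percent):
    """Mass of impurities (mg) delivered with a dose of the given purity."""
    if not 0 <= purity_percent <= 100:
        raise ValueError("purity must be between 0 and 100 percent")
    return dose_g * 1000.0 * (100.0 - purity_percent) / 100.0


# Illustrative doses at the 99.999% purity level mentioned in the text.
for dose in (10, 100):
    print(f"{dose} g dose: about {impurity_mg(dose, 99.999):.2f} mg of impurities")
```

At 99.999% purity a 10 g dose carries roughly 0.1 mg of impurities and a 100 g dose roughly 1 mg, which is consistent with the order of magnitude given in the text.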

In plants, two different methods of HSA purification have been performed at laboratory scale.
Sijmons et al. (12) transformed potato and tobacco plants with Agrobacterium tumefaciens. For the
extraction and purification of HSA, 1000 g of stem and leaf tissue was homogenized in 1000 ml cold
PBS, 0.6% PVP, 0.1 mM PMSF and 1 mM EDTA. The homogenate was clarified by filtration,
centrifuged and the supernatant incubated for 4 h with 1.5 ml polyclonal anti-HSA coupled to
Reactigel spheres (Pierce Chem) in the presence of 0.5% Tween 80. The complex HSA-anti-HSA-Reactigel
was collected and washed with 5 ml 0.5% Tween 80 in PBS. HSA was desorbed from the
Reactigel complex with 2.5 ml of 0.1 M glycine pH 2.5, 10% dioxane, immediately followed by a
buffer exchange with Sephadex G25 to 50 mM Tris pH 8. The sample was then loaded on a HR5/5
MonoQ anion exchange column (Pharmacia) and eluted with a linear NaCl gradient (0-350 mM
NaCl in 50 mM Tris pH 8 in 20 min at 1 ml/min). Fractions containing the concentrated HSA (at 290
mM NaCl) were lyophilized and applied to a HR 10/30 Sepharose 6 column (Pharmacia) in PBS at
0.3 ml/min. However, this method uses affinity columns (polyclonal anti-HSA) that are very
expensive to scale up. Also, the protein is released from the column with 0.1 M glycine pH 2.5, which
typically denatures the protein. Therefore, this method can be suitably modified.
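
The gradient conditions quoted above allow a rough estimate of where the HSA fractions appear. The Python sketch below computes the time and volume at which a linear gradient reaches a given NaCl concentration. It is an idealized calculation: column dead volume, gradient delay and peak width are ignored, and the function name and parameter defaults are assumptions taken from the values in the text.

```python
def gradient_point(target_mM, start_mM=0.0, end_mM=350.0,
                   duration_min=20.0, flow_ml_per_min=1.0):
    """Time (min) and volume (ml) at which a linear gradient reaches target_mM."""
    if not start_mM <= target_mM <= end_mM:
        raise ValueError("target concentration lies outside the gradient")
    time_min = (target_mM - start_mM) / (end_mM - start_mM) * duration_min
    return time_min, time_min * flow_ml_per_min


# HSA was reported to elute at 290 mM NaCl.
t, v = gradient_point(290.0)
print(f"290 mM is reached after about {t:.1f} min ({v:.1f} ml)")
```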

The second method is used for HSA extraction and purification from potato tubers (Dr.
Mingo-Castel's laboratory). After grinding the tuber in phosphate buffer pH 7.4 (1 mg/2 ml), the
homogenate is filtered through miracloth and centrifuged at 14,000 rpm for 15 minutes. After this step,
another filtration of the supernatant through 0.45 µm filters is necessary. Then, ion exchange
chromatography in FPLC using a DEAE Sepharose Fast Flow column (Amersham) is required. Fractions
recovered are passed through an affinity column (Blue Sepharose Fast Flow, Amersham) resulting in
a product of high purity. HSA purification based on both methods can then be investigated.

7) Characterization of the recombinant proteins: For the safe use of recombinant proteins
as a replacement in any of the current applications, these proteins must be structurally equivalent and
must not contain abnormal host-derived modifications. To confirm compliance with these criteria,
human and recombinant proteins can be compared using the highly sensitive and highly
resolving techniques currently expected by the regulatory authorities to characterize recombinant products
(85).

1- Amino acid analysis: N-terminal sequence analysis is performed by Edman degradation using
an ABI 477A protein sequencer with an on-line 120A phenylthiohydantoin-amino acid analyzer.
Automated C-terminal sequence analysis uses a Hewlett-Packard G1009A protein sequencer.
The C-terminal tryptic peptide is isolated from tryptic digests by reverse-phase HPLC to
confirm the C-terminal sequence to a greater number of residues.

2- Protein folding and disulfide bridge formation: Western blots with reducing and non-reducing
gels are performed to check protein folding. Protein standards (Sigma) are loaded
to compare the mobility of the recombinant protein. PAGE is performed on PhastGels
(Pharmacia Biotech). Proteins are blotted and then probed with goat anti-HSA antibodies.
Bound IgG is detected with horseradish peroxidase-labelled anti-goat IgG and visualized on
X-ray film using ECL detection reagents (Amersham).

3- Chromatographic techniques: For HSA, analytical gel-permeation HPLC is performed using
a TSK G3000 SWxl column. Preparative gel permeation chromatography of HSA is
performed using a Sephacryl S200 HR column. The monomer fraction, identified by
absorbance at 280 nm, is dialyzed and reconcentrated to its starting concentration.

4- Viscosity: This is a classical assay for recombinant HSA. Viscosity is a characteristic of
proteins related directly to their size, shape, and conformation. The viscosities of HSA and
recombinant HSA are measured at 100 mg/ml in 0.15 M NaCl using a U-tube viscometer
(M2 type, Poulton, Selfe and Lee Ltd, Essex, UK) at 25°C.

5- Glycosylation: Chloroplast proteins are not known to be glycosylated. However, there are
no publications to confirm or refute this assumption. Therefore, glycosylation will be
measured using a scaled-up version of the method of Ahmed and Furth (86).

8) Animal testing and Pre-Clinical Trials: When albumin is produced at adequate levels in
tobacco and the physicochemical properties of the product correspond to those of the natural protein,
toxicology studies need to be done in mice. To avoid a mouse immune response to the human protein, transgenic
mice carrying HSA genomic sequences are used (87). After injection of 0, 1, 10, 50 and 100 mg
of purified recombinant protein, classical toxicology studies are carried out (body weight and food
intake, animal behavior, piloerection, etc.). Albumin can be tested for blood volume replacement
after paracentesis to eliminate the fluid from the peritoneal cavity in patients with liver cirrhosis. It
has been shown that albumin infusion after this maneuver is essential to preserve effective circulatory
volume and renal function (88).

EXPRESSION OF HUMAN THERAPEUTIC PROTEINS
IN TRANSGENIC TOBACCO CHLOROPLASTS

BACKGROUND OF THE INVENTION
FIELD OF THE INVENTION

The present invention is directed to the expression of genes in plants to produce 
recombinant proteins. 

DESCRIPTION OF RELATED ART 

Research on human proteins in the past years has revolutionized the use of these
therapeutically valuable proteins in a variety of clinical situations. Since the demand for these
proteins is expected to increase considerably in the coming years, it would be wise to ensure that
in the future they will be available in significantly larger amounts, preferably on a cost-effective
basis. Because most genes can be expressed in many different systems, it is essential to
determine which system offers the most advantages for the manufacture of the recombinant
protein. The ideal expression system would be one that produces a maximum amount of safe,
biologically active material at a minimum cost. The use of modified mammalian cells with
recombinant DNA techniques has the advantage of resulting in products which are closely
related to those of natural origin; however, culturing of these cells is intricate and can only be
carried out on a limited scale. The use of microorganisms such as bacteria permits manufacture on
a larger scale, but introduces the disadvantage of producing products which differ appreciably
from the products of natural origin. For example, proteins that are usually glycosylated in
humans are not glycosylated by bacteria. Furthermore, human proteins that are expressed at high
levels in E. coli frequently acquire an unnatural conformation, accompanied by intracellular
precipitation due to lack of proper folding and disulfide bridges. Production of recombinant
proteins in plants has many potential advantages for generating biopharmaceuticals relevant to
clinical medicine. These include the following: (i) plant systems are more economical than
industrial facilities using fermentation systems; (ii) technology is available for harvesting and
processing plants/plant products on a large scale; (iii) elimination of the purification requirement
when the plant tissue containing the recombinant protein is used as a food (edible vaccines); (iv)
plants can be directed to target proteins into stable, intracellular compartments such as chloroplasts, or
proteins can be expressed directly in chloroplasts; (v) the amount of recombinant product that can be produced
approaches industrial-scale levels; and (vi) health risks due to contamination with potential
human pathogens/toxins are minimized.

It has been estimated that one tobacco plant should be able to produce more recombinant
protein than a 300-liter fermenter of E. coli. In addition, a tobacco plant produces a million
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ing large-scale production. Tobacco is also an ideal choice because of its relative 
ease of genetic manipulation and an impending need to explore alternate uses for this hazardous 
crop. However, with the exception of enzymes (e.g. phytase), levels of foreign proteins produced 
in nuclear transgenic plants are generally low, mostly less than 1% of the total soluble protein 
(1). May et al. (2a) discuss this problem using the following examples. Although plant-derived
recombinant hepatitis B surface antigen was as effective as a commercial recombinant vaccine, 
the levels of expression in transgenic tobacco were low (0.0066% of total soluble protein). Even 
though Norwalk virus capsid protein expressed in potatoes caused oral immunization when 
consumed as food (edible vaccine), expression levels were low (0.3% of total soluble protein). In
particular, expression of human proteins in nuclear transgenic plants has been disappointingly
low: e.g. human interferon-β 0.000017% of fresh weight, human serum albumin 0.02% and
erythropoietin 0.0026% of total soluble protein (see Table 1 in ref. 1). A synthetic gene coding for
the human epidermal growth factor was expressed only up to 0.001% of total soluble protein in 
transgenic tobacco (2a). The cost of producing recombinant proteins in alfalfa leaves was
estimated to be 12-fold lower than in potato tubers and comparable with seeds (1). However,
tobacco leaves are much larger and have much higher biomass than alfalfa. The cost of
production of recombinant proteins will be 50-fold lower than that of E. coli fermentation (with
20% expression levels, 1). A decrease in insulin expression from 20% to 5% of biomass doubled
the cost of production (2b). Expression levels below 1% of total soluble protein in plants have
been found not to be commercially feasible (1). Therefore, it is important to increase levels of
expression of recombinant proteins in plants in order to exploit plant production of 
pharmacologically important proteins. 
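
As a rough illustration of the gap described above, the following sketch compares the expression levels quoted in this section against the 1% of total soluble protein threshold. The dictionary labels and function names are hypothetical and chosen only for this example; the interferon figure is omitted because it is reported per fresh weight rather than per total soluble protein.

```python
# Expression levels (% of total soluble protein) as quoted in the text above.
# Labels and function names are illustrative, not taken from the source.
REPORTED_TSP_PERCENT = {
    "hepatitis B surface antigen (tobacco)": 0.0066,
    "Norwalk virus capsid protein (potato)": 0.3,
    "human serum albumin": 0.02,
    "erythropoietin": 0.0026,
    "human epidermal growth factor (tobacco)": 0.001,
}

# Level below which plant production is described as not commercially feasible.
FEASIBILITY_THRESHOLD_PERCENT = 1.0


def below_threshold(levels, threshold=FEASIBILITY_THRESHOLD_PERCENT):
    """Return the names of proteins whose reported level is under the threshold."""
    return sorted(name for name, pct in levels.items() if pct < threshold)


def fold_increase_needed(pct, threshold=FEASIBILITY_THRESHOLD_PERCENT):
    """Factor by which expression would have to rise to reach the threshold."""
    return threshold / pct


if __name__ == "__main__":
    for name in below_threshold(REPORTED_TSP_PERCENT):
        factor = fold_increase_needed(REPORTED_TSP_PERCENT[name])
        print(f"{name}: roughly {factor:,.0f}-fold below the threshold")
```

Every nuclear-transgenic example listed falls under the threshold, which is the motivation the text gives for seeking higher expression levels.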

An alternate approach is to express foreign proteins in chloroplasts of higher plants. We 
have recently integrated foreign genes (up to 10,000 copies per cell) into the tobacco chloroplast
genome, resulting in accumulation of recombinant proteins up to 47% of the total cellular protein
(3). Chloroplast transformation utilizes two flanking sequences that, through homologous 
recombination, insert foreign DNA into the spacer region between the functional genes of the 
chloroplast genome, thus targeting the foreign genes to a precise location. This eliminates the
"position effect" and gene silencing frequently observed in nuclear transgenic plants.
Chloroplast genetic engineering is an environmentally friendly approach, minimizing concerns
of out-cross of introduced traits via pollen to weeds or other crops. Also, the concerns of insects
developing resistance to biopesticides are minimized by hyper-expression of single insecticidal
proteins (high dosage) or expression of different types of insecticides in a single transformation
event (gene pyramiding). Concerns about effects of insecticidal proteins on non-target insects are
minimized by lack of expression in transgenic pollen. Most importantly, a significant advantage in the
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pharmaceutical proteins in chloroplasts is their ability to process eukaryotic 
proteins, including folding and formation of disulfide bridges (4). Chaperonin proteins are 
present in chloroplasts (5,6) that function in folding and assembly of prokaryotic/eukaryotic 
proteins. Also, proteins are activated by disulfide bond oxido/reduction cycles using the 
chloroplast thioredoxin system (7) or chloroplast protein disulfide isomerase (8). Accumulation
of the fully assembled, disulfide-bonded form of human somatotropin via chloroplast transformation
(9), the oligomeric form of CTB (10) and assembly of heavy and light chains of humanized
Guy's 13 antibody in transgenic chloroplasts (11) provide strong evidence for successful
processing of pharmaceutical proteins inside chloroplasts. Such folding and assembly should
eliminate the need for highly expensive in vitro processing of pharmaceutical proteins. For
example, 60% of the total operating cost in the production of human insulin is associated with in
vitro processing (formation of disulfide bridges and cleavage of methionine) (2b).

Taken together, low levels of expression of human proteins in nuclear transgenic plants, 
and difficulty in folding, assembly/processing of human proteins in E. coli should make
chloroplasts an ideal compartment for expression of these proteins; production of human proteins
in transgenic chloroplasts should also dramatically lower the production cost. Large-scale
production of these proteins in plants should be a powerful approach to provide treatment to
patients at an affordable cost and provide tobacco farmers alternate uses for this hazardous crop.
Therefore, we propose here expression of therapeutic proteins in transgenic tobacco chloroplasts
to increase levels of expression and accomplish in vivo processing.

BRIEF DESCRIPTION OF THE FIGURES 
Figure 1 is a graphical representation of total protein versus leaf age in transgenic tobacco 
plants. 

Figure 2 is an electron micrograph showing Cry2Aa2 crystals in a transgenic tobacco leaf.
Figure 3 is a photograph of leaves infected with P. syringae 5 days after inoculation.
Figure 4 is a graph showing the results of an in vitro assay of P. aeruginosa.
Figure 5 is two graphs showing oligomeric CTB expression levels as Total Soluble Protein.
Figure 6 is a Western Blot Analysis of transgenic chloroplast expressed CTB and commercially
available purified CTB antigen.
Figure 7 is a Western Blot Analysis of heavy and light chains of Guy's 13 monoclonal antibody
from plant chloroplasts.
Figure 8 is a Western Blot of transgenic potato tubers, cv. Désirée.
Figure 9 is a frequency histogram showing the percentage of Kennebec and Désirée transgenic
plants expressing different HSA levels.
Figure 10 is a Western Blot of E. coli protein extracts.
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HUMAN SERUM ALBUMIN 

HSA is a monomeric globular protein and consists of a single, generally nonglycosylated,
polypeptide chain of 585 amino acids (66.5 kDa and 17 disulfide bonds) with no
posttranslational modifications. It is composed of three structurally similar globular domains and
the disulfides are positioned in repeated series of nine loop-link-loop structures centered around
eight sequential Cys-Cys pairs. HSA is initially synthesized as pre-pro-albumin by the liver and
released from the endoplasmic reticulum after removal of the aminoterminal propeptide of 18
amino acids. The pro-albumin is further processed in the Golgi complex where the other 6
aminoterminal residues of the propeptide are cleaved by a serine proteinase (12). This results in
the secretion of the mature polypeptide of 585 amino acids. HSA is encoded by two codominant
autosomal allelic genes. HSA belongs to the multigene family of proteins that includes alpha-
fetoprotein and human group-specific component (Gc) or vitamin D-binding family. HSA
facilitates transfer of many ligands across organ circulatory interfaces such as in the liver,
intestine, kidney and brain. In addition to blood plasma, serum albumin is also found in tissues.
HSA accounts for about 60% of the total protein in blood serum. In the serum of human adults,
the concentration of albumin is 40 mg/ml.

The primary function of HSA is the maintenance of colloid osmotic pressure (COP) 
within the blood vessels. Its abundance makes it an important determinant of the 
pharmacokinetic behavior of many drugs. Reduced synthesis of HSA can be due to advanced
liver disease, impaired intestinal absorption of nutrients or poor nutritional intake. Increased
albumin losses can be due to kidney diseases (increased glomerular permeability to
macromolecules in the nephrotic syndrome), intestinal diseases (protein-losing enteropathies) or
exudative skin disorders (burns). Catabolic states such as chronic infections, sepsis, surgery,
intestinal resection, trauma or extensive burns can also cause hypoalbuminemia. HSA is used in
therapy of blood volume disorders, for example posthaemorrhagic acute hypovolemia or
extensive burns, treatment of dehydration states, and also for cirrhotic and hepatic illnesses. It is
also used as an additive in perfusion liquid for extracorporeal circulation. HSA is used clinically
for replacing blood volume, but also has a variety of non-therapeutic uses, including its role as a
stabilizer in formulations for other therapeutic proteins. HSA is a stabilizer for biological
materials in nature and is used for preparing biological standards and reference materials.
Furthermore, HSA is frequently used as an experimental antigen, a cell-culture constituent and a
standard in clinical-chemistry tests.

The expression and purification of recombinant HSA from various microorganisms has
been reported previously (13-17). Saccharomyces cerevisiae has been used to produce HSA both
intracellularly, requiring denaturation and refolding prior to analysis (18), and by secretion (19).
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was equivalent structurally, but the recombinant product had lower levels of 
expression (recovery) and structural heterogeneity compared to the blood derived protein (20). 
HSA was also expressed in Kluyveromyces lactis, a yeast with good secretory properties,
achieving 1 g/liter in fed-batch cultures (21). Ohtani et al. (22) developed an HSA expression
system using Pichia pastoris and established a purification method obtaining recombinant
protein with similar levels of purity and properties as the human protein. In Bacillus subtilis,
HSA could be secreted using bacterial signal peptides (15). HSA production in E. coli was
successful but required additional in vitro processing with trypsin to yield the mature protein
(14). Sijmons et al. (23) expressed HSA in transgenic potato and tobacco plants. Fusion of HSA
to the plant PR-S presequence resulted in cleavage of the presequence at its natural site and
secretion of correctly processed HSA that was indistinguishable from the authentic human
protein. The expression was 0.014% of the total soluble protein. However, none of these methods
has been exploited commercially.

Albumin is currently obtained by protein fractionation from plasma and is the world's
most used intravenous protein, estimated at around 500 metric tons per year. Albumin is
administered by intravenous injection of solutions containing 20% albumin. The average
dosage of albumin for each patient varies between 20 and 40 grams/day. The consumption of
albumin is around 700 kilograms per million inhabitants per year. In addition to the high cost, HSA
carries the risk of transmitting diseases, as with other blood-derived products. The price of
albumin is about $3.7/g. Thus, the market for this protein amounts to approximately $2,600,000
per million people per year (0.7 billion dollars per year in the USA). Because of the high cost of
albumin, synthetic macromolecules (like dextrans) are used to increase plasma colloid osmotic
pressure.
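
The market estimate in the preceding paragraph follows from simple arithmetic, which can be sketched as below. The constant and function names are hypothetical; the numerical inputs are the ones quoted in the text. The population figure is not stated in the source, so the sketch only back-calculates the population implied by the quoted USA total.

```python
# Inputs quoted in the text; names are illustrative only.
PRICE_USD_PER_GRAM = 3.7
CONSUMPTION_KG_PER_MILLION_PEOPLE = 700.0
USA_MARKET_USD_PER_YEAR = 0.7e9


def market_per_million_people(price_per_gram=PRICE_USD_PER_GRAM,
                              kg_per_million=CONSUMPTION_KG_PER_MILLION_PEOPLE):
    """Annual albumin spend in USD per million inhabitants."""
    grams = kg_per_million * 1000.0
    return price_per_gram * grams


def implied_population_millions(total_market_usd, per_million_usd):
    """Population (in millions) implied by a total market and a per-million spend."""
    return total_market_usd / per_million_usd


if __name__ == "__main__":
    per_million = market_per_million_people()
    print(f"Per million people: ${per_million:,.0f} per year")
    implied = implied_population_millions(USA_MARKET_USD_PER_YEAR, per_million)
    print(f"Population implied by the USA figure: about {implied:.0f} million")
```

The per-million result is $2.59 million, which the text rounds to $2,600,000.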

Commercial HSA is mainly prepared from human plasma. This source hardly meets the
requirements of the world market. The availability of human plasma is limited and careful heat
treatment of the product prepared must be performed to avoid potential contamination of the
product by hepatitis, HIV and other viruses. The costs of HSA extraction from blood are very
high. In order to meet the demands of the large albumin market with a safe product at a low cost,
innovative production systems are needed. Plant biotechnology offers promise of obtaining safe
and cheap proteins to be used to treat human diseases.

INTERFERON ALPHA 

Interferons (IFNs) constitute a heterogeneous family of cytokines with antiviral, 
antigrowth, and immunomodulatory properties (24-26). Type I IFNs are acid-stable and
constitute the first line of defence against viruses, both by displaying direct antiviral effects and
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with the cytokine cascade and the immune system. Their function is to induce 
regulation of growth and differentiation of T cells. The human IFN-α family consists of at least
22 intronless genes, 9 of which are pseudogenes and 13 expressed genes (subtypes) (27). Human
IFN-α genes encode proteins of 188 or 189 amino acids. The first 23 amino acids constitute a
signal peptide, and the other 165 or 166 amino acids form the mature protein. IFN-α subtypes
show 78-94% homology at the nucleotide level. The presence of two disulfide bonds, between
Cys-1:Cys-99 and Cys-29:Cys-139, is conserved among all IFN-α species (28). Human IFN-α genes
are expressed constitutively in organs of normal individuals (29,30). Individual IFN-α genes are
differentially expressed depending on the stimulus and they show restricted cell type expression
(31). Although all IFN-α subtypes bind to a common receptor (32), several reports suggest that
they show quantitatively distinct patterns of antiviral, growth inhibitory and immunomodulatory
activities (33). IFN-α8 and IFN-α5 seem to have the greatest antiviral activity in liver tumour
cells HuH7 (33). IFN-α5 has, at least, the same antiviral activity as IFN-α2 in in vitro
experiments (unpublished data in Dr. Prieto's lab). It has been shown recently that IFN-α5 is the
sole IFN-α subtype expressed in normal liver tissue (34). IFN-α5 expression in patients with
chronic hepatitis C is reduced in the liver (34) and induced in mononuclear cells (35).

Interferons are mainly known for their antiviral activities against a wide spectrum of 
viruses but also for their protective role against some non-viral pathogens. They are potent 
immunomodulators, possess direct antiproliferative activities and are cytotoxic or cytostatic for a
number of different tumour cell types. IFN-α is mainly employed as a standard therapy for hairy
cell leukaemia, metastasizing carcinoma and AIDS-associated angiogenic tumours of mixed
cellularity known as Kaposi sarcomas. It is also active against a number of other tumours and
viral infections. For example, it is the current approved therapy for chronic viral hepatitis B
(CHB) and C (CHC). The IFN-α subtype used for chronic viral hepatitis is IFN-α2. About 40%
of patients with CHB and about 25% of patients with CHC respond to this therapy with sustained
viral clearance. The usual doses of IFN-α are 5-10 MU (subcutaneous injection) three days per
week for 4-6 months for CHB and 3 MU three days per week for 12 months for CHC. Three MU
of IFN-α2 represent approximately 15 µg of recombinant protein. The response rate in patients
with chronic hepatitis C can be increased by combining IFN-α2 and ribavirin. This combination
therapy, which considerably increases the cost of the therapy and causes some additional side
effects, results in sustained biochemical and virological remission in about 40-50% of cases.
Recent data suggest that pegylated interferon in weekly doses of 180 µg can also increase the
sustained response rate to about 40%. IFN-α5 is the only IFN-α subtype expressed in liver; this
expression is reduced in patients with CHC, and IFN-α5 seems to have one of the highest
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ty in liver tumour cells (see above). An international patent to use IFN-a5 has 
been filed by Prieto's group to facilitate commercial development (36). 

Human interferons are currently prepared in microbial systems via recombinant DNA 
technology in amounts which cannot be isolated from natural sources (leukocytes, fibroblasts,
lymphocytes). Different recombinant interferon-α genes have been cloned and expressed in E.
coli (37a,b) or yeast (38) by several groups. Generally, the synthesized protein is not correctly
folded due to the lack of disulfide bridges and therefore remains insoluble in inclusion bodies
that need to be solubilized and refolded to obtain the active interferon (39,40). One of the most
efficient methods of interferon-α expression has been published recently by Babu et al. (41). In
this method, E. coli cells transformed with interferon vectors (regulated by temperature-inducible
promoters) were grown in high cell density cultures; this resulted in the production of 4 g
interferon-α/liter of culture. Expression resulted exclusively in the form of insoluble inclusion
bodies, which were solubilized under denaturing conditions, refolded and purified to near
homogeneity. The yield of purified interferon-α was approximately 300 mg/l of culture.
Expression in plants via the nuclear genome has not been very successful. Smirnov et al. (42)
obtained transformed tobacco plants with Agrobacterium tumefaciens using the interferon-α
gene under the 35S CaMV promoter, but the expression level was very low. Edelbaum et al. (43)
showed tobacco nuclear transformation with interferon-β and the expression level detected was
0.000017% of fresh weight.
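
To put the E. coli figures from this paragraph in perspective, the sketch below computes the fraction of expressed interferon recovered after refolding and purification, and the number of 3 MU doses (15 µg each, as stated earlier in this section) that one liter of culture would supply. The function names are hypothetical; the inputs are the values quoted from Babu et al. (41).

```python
# Values quoted in the text; function names are illustrative only.
EXPRESSED_G_PER_LITER = 4.0      # interferon produced as inclusion bodies
PURIFIED_MG_PER_LITER = 300.0    # interferon recovered after refolding and purification
MICROGRAMS_PER_3MU_DOSE = 15.0   # 3 MU is about 15 micrograms of protein


def recovery_fraction(purified_mg_per_l, expressed_g_per_l):
    """Fraction of the expressed protein that survives refolding and purification."""
    return purified_mg_per_l / (expressed_g_per_l * 1000.0)


def doses_per_liter(purified_mg_per_l, micrograms_per_dose):
    """Number of doses obtainable from one liter of culture."""
    return purified_mg_per_l * 1000.0 / micrograms_per_dose


if __name__ == "__main__":
    frac = recovery_fraction(PURIFIED_MG_PER_LITER, EXPRESSED_G_PER_LITER)
    print(f"Recovered after in vitro processing: {frac:.1%}")
    doses = doses_per_liter(PURIFIED_MG_PER_LITER, MICROGRAMS_PER_3MU_DOSE)
    print(f"3 MU doses per liter of culture: {doses:,.0f}")
```

On these figures less than a tenth of the expressed protein is recovered, which is consistent with the emphasis the text places on the cost of in vitro processing.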

The number of subjects infected with hepatitis C virus (HCV) is estimated to be 120
million (5 million in Europe and 4 million in the USA). Seventy per cent of the infected people have
abnormal liver function and about one third of these have severe viral hepatitis or cirrhosis. It
might be estimated, however, that there are about 10,000-15,000 cases of chronic infection with
hepatitis B virus (HBV) in Europe and a slightly lower number of cases in the USA. In Asia the
prevalence of chronic HCV and HBV infection is very high (about 110 million people are
infected by HCV and about 150 million are infected by HBV). In Africa HCV infection is very
prevalent. Since unremitting chronic viral hepatitis leads to liver cirrhosis and eventually to liver
cancer, the high prevalence of HBV and HCV infection in Asia and Africa accounts for their
very high incidence of hepatocellular carcinoma. Based on these data, the need for IFN-α is
large. IFN-α2 is currently produced in microorganisms by a number of companies and the price
of 3 MU (15 µg) of recombinant protein in the western market is about $25. Thus, the cost of
one year of IFN-α2 therapy is about $4,000 per patient. This price makes this product unavailable
for most of the patients in the world suffering from chronic viral hepatitis. Clearly, methods to
produce less expensive recombinant proteins via plant biotechnology innovations would be
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3 antiviral therapy widely available. Besides, if IFN-oc5 is more efficient than IFN- 
a2, lower doses may be required. 
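
The annual therapy cost quoted above can be reproduced from the dosing schedule given earlier for chronic hepatitis C (3 MU three days per week for 12 months) and the quoted price per 3 MU dose. This is a minimal sketch; the names are hypothetical, and treating 12 months as 52 weeks is an assumption made here, not a statement from the source.

```python
# Values quoted in the text, except WEEKS_PER_YEAR, which is an assumption.
PRICE_PER_DOSE_USD = 25.0     # price of one 3 MU dose
MICROGRAMS_PER_DOSE = 15.0    # 3 MU is about 15 micrograms of protein
DOSES_PER_WEEK = 3            # three days per week
WEEKS_PER_YEAR = 52           # assumption: "12 months" taken as 52 weeks


def annual_doses(doses_per_week=DOSES_PER_WEEK, weeks=WEEKS_PER_YEAR):
    """Number of doses one patient receives in a year of therapy."""
    return doses_per_week * weeks


def annual_cost_usd(price_per_dose=PRICE_PER_DOSE_USD):
    """Drug cost of one year of therapy for one patient."""
    return annual_doses() * price_per_dose


def annual_protein_micrograms(micrograms_per_dose=MICROGRAMS_PER_DOSE):
    """Mass of recombinant protein one patient consumes in a year."""
    return annual_doses() * micrograms_per_dose


if __name__ == "__main__":
    print(f"Doses per year: {annual_doses()}")
    print(f"Annual cost: ${annual_cost_usd():,.0f}")
    print(f"Annual protein: {annual_protein_micrograms() / 1000:.2f} mg")
```

The computed cost of $3,900 matches the figure of about $4,000 per patient given in the text.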

INSULIN-LIKE GROWTH FACTOR-I (IGF-I) 

The Insulin-like Growth Factor protein, IGF-I, is an anabolic hormone with a complex
maturation process. A single IGF-I gene is transcribed into several mRNAs by alternative
splicing and use of different transcription initiation sites (44-46). Depending on the choice of
splicing, two immature proteins are produced: IGF-IA, expressed in several tissues, and IGF-IB,
mostly expressed in liver (45). Both pre-proteins produce the same mature protein. The A and B
immature forms have different lengths and compositions, as their termini are modified post-
translationally by glycosylation. However, these ends are processed in the last step of maturation.
Mature IGF-I protein is secreted, not glycosylated and has three disulfide bonds, 70 amino acids
and a molecular weight of 7.6 kDa (47-49). Physiologically, IGF-I expression is induced by
growth hormone (GH). Indeed, the knock-out of IGF-I in mice has shown that several functions
originally attributed to GH are in fact mediated by IGF-I. GH production by the adenohypophysis is
repressed by feed-back inhibition by IGF-I. GH induces IGF-I synthesis in different tissues, but
mostly in liver, where 90% of IGF-I is produced (48). The IGF-I receptor is expressed in
different tissues. It is formed by two polypeptides: alpha, which interacts with IGF-I, and beta,
which is involved in signal transduction and is also present in the insulin receptor (50,51). Thus,
IGF-I and insulin activation are similar.

IGF-I is a potent multifunctional anabolic hormone produced in the liver upon 
stimulation by growth hormone (GH). In liver cirrhosis the reduction of receptors for GH in
hepatocytes and the diminished synthesis of the liver parenchyma cause a progressive fall of
serum IGF-I levels. Patients with liver cirrhosis have a number of systemic derangements such
as muscle atrophy, osteopenia, hypogonadism and protein-calorie malnutrition, which could be
related to reduced levels of circulating IGF-I. Recent studies from Prieto's laboratory have
demonstrated that treatments with low doses of IGF-I induce significant improvements in
nutritional status (52), intestinal absorption (53-55), osteopenia (56), hypogonadism (57) and
liver function (58) in rats with experimental liver cirrhosis. These data support that IGF-I
deficiency plays a pathogenic role in several systemic complications occurring in liver cirrhosis.
The liver can be considered as an endocrine gland synthesising a hormone such as IGF-I with
important physiological functions. Thus liver cirrhosis should be viewed as a disease
accompanied by a hormone deficiency syndrome for which replacement therapy with IGF-I is
warranted. Clinical studies are in progress to ascertain the role of IGF-I in the management of
cirrhotic patients. IGF-I is also being currently used for Laron dwarfism treatment. These
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liver GH receptor so IGF-I is not expressed (59). Also IGF-I, acting as a 
hypoglycemiant, is given together with insulin in diabetes mellitus (60,61). Anabolic effects of
IGF-I are used in osteoporosis treatment (62,63) and in hypercatabolism and starvation due to burns
and HIV infection (64,65). Unpublished studies indicate that IGF-I could also be used in patients
with articular degenerative disease (osteoarthritis).

The potency of IGF-I has encouraged a great number of scientists to try IGF-I expression
in various microorganisms due to the small amount present in human plasma. Production of IGF-I
in yeast was shown to have several disadvantages, such as low fermentation yields and the risk of
obtaining undesirable glycosylation in these molecules (66). Expression in bacteria has been the
most successful approach, either as a secreted form fused to protein leader sequences (67) or
fused to a solubilized affinity fusion protein (68). In addition, IGF-I has been produced as
insoluble inclusion bodies fused to protective polypeptides (69). Sun-Ok Kim and Young Lee
(70a) expressed IGF-I as a truncated beta-galactosidase fusion protein. The final purification
yielded approximately 5 mg of IGF-I having native conformation per liter of bacterial culture.
IGF-I has also been expressed in animals. Zinovieva et al. (70b) reported an expression of 0.543
mg/ml in rabbit milk.

IGF-I circulates in plasma in a fairly high concentration varying between 120 and 400 ng/ml.
In cirrhotic patients the values of IGF-I fall to 20 ng/ml and frequently to undetectable levels.
Replacement therapy with IGF-I in liver cirrhosis requires administration of 1.5-2 mg per day for
each patient. Thus, every cirrhotic patient will consume about 600 mg per year. IGF-I is
currently produced in bacteria (71). The high amount of recombinant protein needed for IGF-I
replacement therapy in patients with liver cirrhosis will make this treatment exceedingly
expensive if new methods for cheap production of recombinant proteins are not developed.
Besides, as described above, IGF-I is used in treatment of dwarfism, diabetes, osteoporosis,
starvation and hypercatabolism. IGF-I use in osteoarthritis is currently being investigated.
Again, plant biotechnology could provide a solution to make economically feasible the
application of IGF-I therapy to all these patients.
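
The scale of the IGF-I requirement can be sketched from the figures given in this section: the daily replacement dose, the quoted annual consumption of about 600 mg, and the bacterial yield of approximately 5 mg per liter reported by Kim and Lee (70a). The names below are hypothetical, and the assumption of year-round daily dosing (365 days) is made here for illustration only.

```python
# Values quoted in the text, except DAYS_PER_YEAR, which is an assumption.
DAILY_DOSE_MG_RANGE = (1.5, 2.0)   # replacement dose per patient per day
QUOTED_ANNUAL_MG = 600.0           # annual consumption stated in the text
BACTERIAL_YIELD_MG_PER_L = 5.0     # native IGF-I per liter of culture (70a)
DAYS_PER_YEAR = 365                # assumption: dosing every day of the year


def annual_requirement_mg(daily_mg, days=DAYS_PER_YEAR):
    """IGF-I needed by one patient in one year at a given daily dose."""
    return daily_mg * days


def culture_liters_needed(annual_mg, yield_mg_per_l=BACTERIAL_YIELD_MG_PER_L):
    """Liters of bacterial culture needed to supply one patient for a year."""
    return annual_mg / yield_mg_per_l


if __name__ == "__main__":
    low, high = (annual_requirement_mg(d) for d in DAILY_DOSE_MG_RANGE)
    print(f"Annual requirement: {low:.1f} to {high:.1f} mg per patient")
    liters = culture_liters_needed(QUOTED_ANNUAL_MG)
    print(f"Culture volume for the quoted 600 mg: {liters:.0f} liters per patient")
```

Under these assumptions the quoted 600 mg lies within the computed range, and a single patient would need on the order of a hundred liters of bacterial culture per year at the reported yield.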



SUMMARY OF THE INVENTION

The present invention develops recombinant DNA vectors for enhanced expression of 
human serum albumin, insulin-like growth factor I, and interferon-α 2 and 5, via
chloroplast genomes of tobacco,
optimizes processing and purification of pharmaceutical proteins using chloroplast vectors in E.
coli, and
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w ;nic tobacco plants. 

The transgenic expression of proteins or fusion proteins is characterized using molecular and
biochemical methods in chloroplasts.
Existing or modified methods of purification are employed on transgenic leaves.
Mendelian or maternal inheritance of transgenic plants is analyzed.
Large-scale purification of therapeutic proteins from transgenic tobacco and comparison with
current purification methods in E. coli or yeast is performed, and
natural refolding in chloroplasts is compared with existing in vitro processing methods;
comparison/characterization (yield and purity) of therapeutic proteins produced in yeast or
E. coli with those from transgenic tobacco chloroplasts is performed, as are
in vitro and in vivo (pre-clinical trials) studies of protein biofunctionality.

DETAILED DESCRIPTION OF THE INVENTION 

When the concept of chloroplast genetic engineering was developed (72,73), it was
possible to introduce isolated intact chloroplasts into protoplasts and regenerate transgenic plants
(74). Therefore, early investigations on chloroplast transformation focused on the development
of in organello systems using intact chloroplasts capable of efficient and prolonged transcription
and translation (75-77) and expression of foreign genes in isolated chloroplasts (78). However,
after the discovery of the gene gun as a transformation device (79), it was possible to transform
plant chloroplasts without the use of isolated plastids and protoplasts. Chloroplast genetic
engineering was accomplished in several phases. Transient expression of foreign genes in
plastids of dicots (80,81) was followed by such studies in monocots (82). Unique to
chloroplast genetic engineering is the development of a foreign gene expression system using
autonomously replicating chloroplast expression vectors (80). Stable integration of a selectable
marker gene into the tobacco chloroplast genome (83) was also accomplished using the gene
gun. However, useful genes conferring valuable traits via chloroplast genetic engineering have
been demonstrated only recently. For example, plants resistant to B.t.-sensitive insects were
obtained by integrating the cry1Ac gene into the tobacco chloroplast genome (84). Plants resistant
to B.t.-resistant insects (up to 40,000-fold) were obtained by hyper-expression of the cry2A gene
within the tobacco chloroplast genome (85). Plants have also been genetically engineered via the
chloroplast genome to confer herbicide resistance and the introduced foreign genes were
maternally inherited, overcoming the problem of out-cross with weeds (86). Chloroplast genetic
engineering technology is currently being applied to other useful crops (73,87).

A remarkable feature of chloroplast genetic engineering is the observation of
exceptionally large accumulation of foreign proteins in transgenic plants, as much as 46% of
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n total soluble protein, even in bleached old leaves (3). Stable expression of a 
pharmaceutical protein in chloroplasts was first reported for GVGVP, a protein-based polymer
with varied medical applications (such as the prevention of post-surgical adhesions and scars,
wound coverings, artificial pericardia, tissue reconstruction and programmed drug delivery (88)).
Subsequently, expression of human somatotropin via the tobacco chloroplast genome (9) to
high levels (7% of total soluble protein) was observed. The following investigations that are in
progress in the Daniell laboratory illustrate the power of this technology to express small
peptides, entire operons, vaccines that require oligomeric proteins with stable disulfide bridges
and monoclonals that require assembly of heavy/light chains via chaperonins.

In plant and animal cells, nuclear mRNAs are translated monocistronically. This poses a
serious problem when engineering multiple genes in plants (91). Therefore, in order to express
the polyhydroxybutyrate polymer or Guy's 13 antibody, single genes were first introduced into
individual transgenic plants, then these plants were back-crossed to reconstitute the entire
pathway or the complete protein (92,93). Similarly, in a seven-year-long effort, Ye et al. (81)
recently introduced a set of three genes for a short biosynthetic pathway that resulted in
β-carotene expression in rice. In contrast, most chloroplast genes of higher plants are cotranscribed
(91). Expression of polycistrons via the chloroplast genome provides a unique opportunity to
express entire pathways in a single transformation event. The Bacillus thuringiensis (Bt)
cry2Aa2 operon has recently been used as a model system to demonstrate operon expression and
crystal formation via the chloroplast genome (3). Cry2Aa2 is the distal gene of a three-gene
operon. The orf immediately upstream of cry2Aa2 codes for a putative chaperonin that facilitates
the folding of cry2Aa2 (and other proteins) to form proteolytically stable cuboidal crystals (94).

Therefore, the cry2Aa2 bacterial operon was expressed in tobacco chloroplasts to test the
resultant transgenic plants for increased expression and improved persistence of the accumulated
insecticidal protein(s). Stable foreign gene integration was confirmed by PCR and Southern blot
analysis in T0 and T1 transgenic plants. Cry2Aa2 operon-derived protein accumulated at 45.3% of
the total soluble protein in mature leaves and remained stable even in old bleached leaves
(46.1%) (Figure 1). This is the highest level of foreign gene expression ever reported in
transgenic plants. Exceedingly difficult-to-control insects (10-day-old cotton bollworm,
beet armyworm) were killed 100% after consuming transgenic leaves. Electron micrographs
showed the presence of the insecticidal protein folded into cuboidal crystals similar in shape to
Cry2Aa2 crystals observed in Bacillus thuringiensis (Figure 2). In contrast to currently marketed
transgenic plants with soluble CRY proteins, folded protoxin crystals will be processed only by
target insects that have alkaline gut pH; this approach should improve safety of Bt transgenic
plants. Absence of insecticidal proteins in transgenic pollen eliminates toxicity to non-target
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Hen. In addition to these environmentally friendly approaches, this observation 
should serve as a model system for large-scale production of foreign proteins within chloroplasts 
in a folded configuration enhancing their stability and facilitating single step purification. This is
the first demonstration of expression of a bacterial operon in transgenic plants and opens the
door to engineer novel pathways in plants in a single transformation event.

It is common knowledge that the medical community has been fighting a vigorous battle 
against drug resistant pathogenic bacteria for years. Cationic antibacterial peptides from 
mammals, amphibians and insects have gained more attention over the last decade (95). Key 
features of these cationic peptides are a net positive charge, an affinity for negatively charged
prokaryotic membrane phospholipids over neutral-charged eukaryotic membranes and the ability
to form aggregates that disrupt the bacterial membrane (96).

There are three major peptides with α-helical structures: cecropin from Hyalophora
cecropia (giant silk moth), magainins from Xenopus laevis (African frog) and defensins from
mammalian neutrophils. Magainin and its analogues have been studied as a broad-spectrum
topical agent, a systemic antibiotic, a wound-healing stimulant and an anticancer agent (97). We
have recently observed that a synthetic lytic peptide (MSI-99, 22 amino acids) can be
successfully expressed in tobacco chloroplasts (98). The peptide retained its lytic activity against
the phytopathogenic bacterium Pseudomonas syringae and the multidrug-resistant human pathogen
Pseudomonas aeruginosa. The antimicrobial peptide (AMP) used in this study was an
amphipathic alpha-helix molecule that has an affinity for negatively charged phospholipids
commonly found in the outer membrane of bacteria. Upon contact with these membranes,
individual peptides aggregate to form pores in the membrane, resulting in bacterial lysis.
Because of the concentration-dependent action of the AMP, it was expressed via the chloroplast
genome to accomplish high-dose delivery at the point of infection. PCR products and Southern
blots confirmed chloroplast integration of the foreign genes and homoplasmy. Growth and
development of the transgenic plants were unaffected by hyper-expression of the AMP within
chloroplasts. In vitro assays with T0 and T1 plants confirmed that the AMP was expressed at
high levels (21.5 to 43% of the total soluble protein) and retained biological activity against
Pseudomonas syringae, a major plant pathogen. In situ assays resulted in intense areas of
necrosis around the point of infection in control leaves, while transformed leaves showed no
signs of necrosis (200-800 µg of AMP at the site of infection) (Figure 3). T1 in vitro assays
against Pseudomonas aeruginosa (a multidrug-resistant human pathogen) displayed a 96%
inhibition of growth (Figure 4). These results give a new option in the battle against
phytopathogenic and drug-resistant human pathogenic bacteria. Small peptides (like insulin) are
degraded in most organisms. However, stability of this AMP in chloroplasts opens up this
compartment for expression of hormones and other small peptides.

Expression of cholera toxin B subunit oligomers as a vaccine in chloroplasts

Vibrio cholerae causes acute watery diarrhea by colonizing the small intestine and
producing the enterotoxin, cholera toxin (CT). Cholera toxin is a hexameric AB5 protein
consisting of one toxic 27 kDa A subunit having ADP ribosyl transferase activity and a nontoxic
pentamer of 11.6 kDa B subunits (CTB) that binds to the A subunit and facilitates its entry into
the intestinal epithelial cells. CTB, when administered orally (99), is a potent mucosal immunogen
which can neutralize the toxicity of the CT holotoxin by preventing it from binding to the
intestinal cells (100). This is believed to be a result of it binding to eukaryotic cell surfaces via
the GM1 gangliosides, receptors present on the intestinal epithelial surface, thus eliciting a
mucosal immune response to pathogens (101) and enhancing the immune response when
chemically coupled to other antigens (102-105).

Cholera toxin B subunit (CTB) has previously been expressed in nuclear transgenic plants at levels
of 0.01% (leaves) to 0.3% (tubers) of the total soluble protein. To increase expression levels, we
engineered the chloroplast genome to express the CTB gene (10). We observed expression of
oligomeric CTB at levels of 4-5% of total soluble plant protein (Figure 5A). PCR and Southern
blot analyses confirmed stable integration of the CTB gene into the chloroplast genome. Western
blot analysis showed that transgenic chloroplast-expressed CTB was antigenically identical to
commercially available purified CTB antigen (Figure 6). Also, GM1-ganglioside binding assays
confirmed that chloroplast-synthesized CTB binds to the intestinal membrane receptor of cholera
toxin (Figure 5B). Transgenic tobacco plants were morphologically indistinguishable from
untransformed plants and the introduced gene was found to be stably inherited in the subsequent
generation, as confirmed by PCR and Southern blot analyses. The increased production of an
efficient transmucosal carrier molecule and delivery system, like CTB, in chloroplasts of plants
makes plant-based oral vaccines, and fusion proteins with CTB needing oral administration, a
much more feasible approach. This also establishes unequivocally that chloroplasts are capable
of forming disulfide bridges to assemble foreign proteins.

Expression and assembly of monoclonals in transgenic chloroplasts

Dental caries (cavities) is probably the most prevalent disease of humankind.
Colonization of teeth by S. mutans is the single most important risk factor in the development of
dental caries. S. mutans is a non-motile, gram-positive coccus. It colonizes tooth surfaces and
synthesizes glucans (insoluble polysaccharides) and fructans from sucrose using the enzymes
glucosyltransferase and fructosyltransferase, respectively (106a). The glucans play an important
role by allowing the bacterium to adhere to the smooth tooth surfaces. After its adherence, the
bacterium ferments sucrose and produces lactic acid. Lactic acid dissolves the minerals of the
tooth, producing a cavity.

A topical monoclonal antibody therapy to prevent adherence of S. mutans to teeth has
recently been developed. The incidence of cariogenic bacteria (in humans and animals) and
dental caries (in animals) was dramatically reduced for periods of up to two years after the
cessation of the antibody therapy. No adverse events were detected either in the exposed animals
or in human volunteers (106b). The annual requirement for this antibody in the US alone may
eventually exceed 1 metric ton. Therefore, this antibody was expressed via the chloroplast
genome to achieve higher levels of expression and proper folding (11). The integration of
antibody genes into the chloroplast genome was confirmed by PCR and Southern blot analysis.
The expression of both heavy and light chains was confirmed by western blot analysis under
reducing conditions (Figure 7A,B). The expression of fully assembled antibody was confirmed
by western blot analysis under non-reducing conditions (Figure 7C). This is the first report of
successful assembly of a multi-subunit human protein in transgenic chloroplasts. Production of
monoclonal antibodies at agricultural scale should reduce their cost and create new applications
of monoclonal antibodies.

HUMAN SERUM ALBUMIN 

Nuclear transformation

The human HSA cDNA was cloned from human liver cells, and the patatin promoter
(whose expression is tuber specific (107)) was fused along with the leader sequence of PIN II
(proteinase II inhibitor potato transit peptide that directs HSA to the apoplast (108)). Leaf discs
of Desiree and Kennebec potato plants were transformed using Agrobacterium tumefaciens. A
total of 98 transgenic Desiree clones and 30 Kennebec clones were tested by PCR and western
blots. Western blots showed that the recombinant albumin (rHSA) had been properly cleaved by
the proteinase II inhibitor transit peptide (Figure 8). Expression levels in both cultivars varied
widely among transgenic clones, as expected (Figure 9), probably because of position effects
and gene silencing (89,90). The population distribution was similar in both cultivars: the majority of
transgenic clones showed expression levels between 0.04 and 0.06% of rHSA in the total soluble
protein. The maximum recombinant HSA amount expressed was 0.2%. Between one and five
T-DNA insertions per tetraploid genome were observed in these clones. Plants with higher protein
expression were always clones with several copies of the HSA gene. Levels of mRNA were
analyzed by Northern blots. There was a correlation between transcript levels and recombinant
albumin accumulation in transgenic tubers. The N-terminal sequence showed proper cleavage of
the transit peptide, and the amino terminal sequences of recombinant and human HSA were
identical. Inhibition of patatin expression using antisense technology did not improve the
amount of rHSA. Average expression level among 29 transgenic plants was 0.032% of total
soluble protein, with a maximum expression of 0.1%.
Transformation of the tobacco chloroplast genome was initiated for hyperexpression of
HSA. The codon composition is ideal for chloroplast expression and no changes in nucleotide
sequences were necessary. The pLD vector was used for all the constructs. Several vectors were
designed to optimize HSA expression. All of these contained ATG as the first codon of the
mature protein.
RBS-ATG-HSA

The first vector included the gene that codes for the mature HSA plus an additional ATG
as a translation initiation codon. We included the ATG in one of the primers of the PCR, 5
nucleotides downstream of the chloroplast-preferred RBS sequence GGAGG. The cDNA
sequence of the mature HSA (cloned in Dr. Mingo-Castel's laboratory) was used as a template.
The PCR product was cloned into the pCR 2.1 vector, excised as an EcoRI-NotI fragment and
introduced into the pLD vector.
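
The primer layout described above (GGAGG, a short spacer, then ATG) can be checked with a small script. This is only a sketch: it assumes that "5 nucleotides downstream" means five spacer bases between the end of GGAGG and the A of ATG, and the example primer, including its spacer bases, is a hypothetical placeholder rather than the primer actually used.

```python
RBS = "GGAGG"   # chloroplast-preferred ribosome binding site named in the text
START = "ATG"   # added translation initiation codon


def spacer_length(primer: str) -> int:
    """Return the number of nucleotides between the end of the RBS and the ATG."""
    seq = primer.upper()
    rbs_at = seq.find(RBS)
    if rbs_at < 0:
        raise ValueError("primer lacks the GGAGG ribosome binding site")
    atg_at = seq.find(START, rbs_at + len(RBS))
    if atg_at < 0:
        raise ValueError("primer lacks an ATG downstream of the RBS")
    return atg_at - (rbs_at + len(RBS))


# Hypothetical primer: the spacer and the bases after ATG are placeholders only.
example_primer = "GGAGG" + "TTTAA" + "ATG" + "GATGCACAC"
print(spacer_length(example_primer))
```

A check of this kind is useful before ordering primers, because a spacer base that accidentally creates an upstream ATG would shift the reported position.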
5'UTRpsbA-ATG-HSA

The 200 bp tobacco chloroplast DNA fragment containing the 5' psbA UTR was
amplified using PCR and tobacco DNA as template. The fragment was cloned into the pCR 2.1
vector, excised as an EcoRI-NcoI fragment, inserted at the NcoI site of the ATG-HSA and finally
inserted into the pLD vector as an EcoRI-NotI fragment downstream of the 16S rRNA promoter
to enhance translation of the protein.
BtORF1+2-ATG-HSA

ORF1 and ORF2 of the Bt Cry2Aa2 operon were amplified in a PCR using the complete
operon as a template. The fragment was cloned into the pCR 2.1 vector, excised as an EcoRI-EcoRV
fragment, inserted at the EcoRV site with the ATG-HSA sequence and introduced into the pLD
vector as an EcoRI-NotI fragment. ORF1 and ORF2 were fused upstream of the ATG-HSA.

Because of the similarity of protein synthetic machinery (109), expression of all
chloroplast vectors was first tested in E. coli before their use in tobacco transformation. Different
levels of expression were obtained in E. coli depending on the construct (Figure 10). Using the
psbA 5' UTR and the ORF1 and ORF2 of the cry2Aa2 operon, we obtained higher levels of
expression than using only the RBS. We have observed in previous experiments that HSA in E.
coli is completely insoluble (as is shown in ref 14), probably due to improper folding resulting
from the absence of disulfide bonds. This is the reason why the protein is precipitated in the gel
(Figure 10). Different polypeptide sizes were observed, probably due to incomplete translation.
Because E. coli and chloroplasts have similar protein synthesis machinery, one could expect
different levels of expression in transgenic tobacco chloroplasts depending on the regulatory
sequences, with the advantage that disulfide bonds are formed in chloroplasts (9). These three
vectors were introduced into tobacco leaves via particle bombardment (110) and after 4 weeks
small shoots appeared as a result of independent transformation events.

INTERFERON-α5

Interferon-α5 has not yet been expressed as a commercial recombinant protein. The first
attempt has been made recently. The IFN-α5 gene was cloned and the sequence of the mature
protein was inserted into the pET28 vector, which included the ATG, a histidine tag for purification
and thrombin cleavage sequences. The tagged IFN-α5 was purified first by binding to a nickel
column, and biotinylated thrombin was then used to eliminate the tag on IFN-α5. Biotinylated
thrombin was removed from the preparation using streptavidin agarose. The expression level was
5.6 micrograms per liter of broth culture and the recombinant protein showed antiviral
activity similar to or higher than commercial IFN-α2 (Intron A, Schering-Plough).

Insulin-like Growth Factor-I (IGF-I)

Recent studies have demonstrated that treatment with low doses of IGF-I induced
significant improvements in nutritional status (52), intestinal absorption (53-55), osteopenia (56),
hypogonadism (57) and liver function (58) in rats with experimental liver cirrhosis. These data
support the view that IGF-I deficiency plays a pathogenic role in several systemic complications
occurring in liver cirrhosis. Clinical studies are in progress to ascertain the role of IGF-I in the
management of cirrhotic patients. Unpublished studies indicate that IGF-I could also be used in
patients with articular degenerative disease (osteoarthritis).

Experimental
Example 1

Evaluation of chloroplast gene expression

A systematic approach is used to identify and overcome potential limitations of foreign
gene expression in chloroplasts of transgenic plants. This experiment increases the utility of the
chloroplast transformation system for scientists interested in expressing other foreign proteins.
Therefore, it is important to systematically analyze transcription, RNA abundance, RNA
stability, rate of protein synthesis and degradation, proper folding and biological activity. The
rate of transcription of the introduced HSA gene is compared with the highly expressing
endogenous chloroplast genes (rbcL, psbA, 16S rRNA), using run-on transcription assays to
determine whether the 16S rRNA promoter is operating as expected. The transcription efficiency of
transgenic chloroplasts containing each of the three constructs with different 5' regions is tested.
Similarly, transgene RNA levels are monitored by northerns, dot blots and primer extension
relative to endogenous rbcL, 16S rRNA or psbA. These results, along with run-on transcription
assays, provide valuable information on RNA stability, processing, etc. RNA appears to be
extremely stable based on northern blot analysis. This systematic study is valuable to advance
utility of this system by other scientists. Most importantly, the efficiency of translation is tested
in isolated chloroplasts and compared with the highly translated chloroplast protein (psbA).
Pulse chase experiments help assess whether translational pausing or premature termination occurs.
Evaluation of percent RNA loaded on polysomes or in constructs with or without 5' UTRs helps
to determine the efficiency of the ribosome binding site and 5' stem-loop translational enhancers.
Codon-optimized genes (IGF-I, IFN) are compared with unmodified genes to investigate the rate
of translation, pausing and termination. A 200-fold difference in accumulation of foreign
proteins due to decreases in proteolysis conferred by a putative chaperonin (3) was observed.
Therefore, proteins from constructs expressing or not expressing the putative chaperonin (with or
without ORF1+2) provide valuable information on protein stability.

Example 2 

Expression of the mature protein 

HSA, Interferon and IGF-I are pre-proteins that need to be cleaved to secrete mature
proteins. The codon for translation initiation is in the presequence. In chloroplasts, the necessity
of expressing the mature protein forces introduction of an additional amino acid (methionine) in
the coding sequences. In order to optimize expression levels, we first subclone the sequence of the
mature proteins beginning with an ATG. Subsequent immunological assays in mice determine whether
the extra methionine causes an immunogenic response or low bioactivity. Alternatively, other
systems may be used to produce the mature protein. These systems can include the synthesis of a
protein fused to a peptide that is cleaved intracellularly (processed) by chloroplast enzymes, or the
use of chemical or enzymatic cleavage after partial purification of proteins from plant cells.

Use of peptides that are cleaved in chloroplast

Staub et al. (9) reported chloroplast expression of human somatotropin similar to the
native human protein by using ubiquitin fusions that were cleaved in the stroma by a ubiquitin
protease. However, the processing efficiency ranged from 30-80% and the cleavage site was not
accurate. In order to process chloroplast-expressed proteins, a peptide which is cleaved in the
stroma is essential. The transit peptide sequence of the RuBisCo (ribulose 1,5-bisphosphate
carboxylase/oxygenase) small subunit is an ideal choice. This transit peptide has been studied in depth
(111). RuBisCo is one of the proteins that is synthesized in the cytoplasm and transported
post-translationally into the chloroplast in an energy-dependent process. The transit peptide is
proteolytically removed upon transport into the stroma by the stromal processing peptidase (112).
There are several sequences described for different species (113). A transit peptide consensus
sequence for the RuBisCo small subunit of vascular plants is published by Keegstra et al. (114).
The amino acids that are proximal to the C-terminus (41-59) are highly conserved in the higher
plant transit sequences and belong to the domain which is involved in enzymatic cleavage (111).
The RuBisCo small subunit transit peptide has been fused with various marker proteins
(114,115), even with animal proteins (116,117), to target proteins to the chloroplast. Prior to
transformation studies, the cleavage efficiency and accuracy are tested by in vitro translation of
the fusion proteins and in organello import studies using intact chloroplasts. Thereafter,
knowing the correct fusion sequence for producing the mature protein, the sequence encoding
the amino terminal portion of the tobacco chloroplast transit peptide is linked with the mature
sequence of each protein. Codon composition of the tobacco RuBisCo small subunit transit
peptide is compatible with chloroplast optimal translation (see section d3 and table 1 on page
30). Additional transit peptide sequences for targeting and cleavage in the chloroplast have been
described (111). The lumen of thylakoids could also be a good target because thylakoids are
readily purified. Lumenal proteins can be freed either by sonication or with a very low Triton
X-100 concentration, although this requires insertion of additional amino acid sequences for
efficient import (111).



Example 3 

Use of chemical or enzymatic cleavage 

The strategy of fusing a protein to a tag with affinity for a certain ligand has been used
extensively for more than a decade to enable affinity purification of recombinant products
(118-120). A vast number of cleavage methods, both chemical and enzymatic, have been investigated
for this purpose (120). Chemical cleavage methods have low specificity and the relatively harsh
cleavage conditions can result in chemical modifications of the released products (120). Some of
the enzymatic methods offer significantly higher cleavage specificities together with high
efficiency, e.g. H64A subtilisin, IgA protease and factor Xa (119,120), but these enzymes have
the drawback of being quite expensive.

Trypsin, which cleaves C-terminal of basic amino acid residues, has been used for a long
time to cleave fusion proteins (14,121). Despite expected low specificity, trypsin has been shown
to be useful for specific cleavage of fusion proteins, leaving basic residues within folded protein
domains uncleaved (121). The use of trypsin only requires that the N-terminus of the mature
protein be accessible to the protease and that the potential internal sites are protected in the
native conformation. Trypsin has the additional advantage of being inexpensive and readily
available. In the case of HSA, when it was expressed in E. coli with 6 additional codons coding
for a trypsin cleavage site, HSA was processed successfully into the mature protein after
treatment with the protease. In addition, the N-terminal sequence was found to be unique and
identical to the sequence of natural HSA, the conversion was complete and no degradation
products were observed (14). This in vitro maturation is selective because correctly folded
albumin is highly resistant to trypsin cleavage at inner sites (14). This system could be tested for
chloroplast HSA vectors using protein expressed in E. coli.
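
The selectivity argument above can be illustrated with a short script. It is a sketch under stated assumptions: it uses only the rule given in the text (trypsin cuts C-terminal of Lys or Arg), it models protection by folding simply as a set of accessible positions supplied by the user, and the leader and protein fragment are hypothetical placeholders.

```python
BASIC = {"K", "R"}  # trypsin cuts C-terminal of lysine or arginine


def potential_sites(protein: str) -> list[int]:
    """1-based positions of every basic residue, i.e. every possible trypsin cut."""
    return [i + 1 for i, aa in enumerate(protein.upper()) if aa in BASIC]


def digest(protein: str, accessible: set[int]) -> list[str]:
    """Cut only after basic residues whose position is accessible to the protease."""
    fragments, start = [], 0
    for pos in potential_sites(protein):
        if pos in accessible:
            fragments.append(protein[start:pos])
            start = pos
    if start < len(protein):
        fragments.append(protein[start:])
    return fragments


# Hypothetical fusion: a short leader ending in Arg, then the start of a mature protein.
fusion = "MGSR" + "DAHKSEVAHR"
print(potential_sites(fusion))        # all basic residues, protected or not
print(digest(fusion, accessible={4}))  # only the linker arginine is exposed
```

With every site marked accessible, the same function returns the full digest, which is the outcome expected for an unfolded protein.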

Staub et al. (9) demonstrated that the chloroplast methionine aminopeptidase is active:
they found 95% removal of the first methionine of an ATG-somatotropin protein that was
expressed via the chloroplast genome. Several investigations have shown a very
strict pattern of cleavage by this peptidase (122). Methionine is only removed when the second
residue is glycine, alanine, serine, cysteine, threonine, proline or valine, but if the third amino
acid is proline the cleavage is inhibited. Of our three proteins, we use this
approach to obtain the mature protein in the case of Interferon, because the penultimate
amino acid is cysteine followed by aspartic acid. For HSA the second amino acid is aspartic acid,
and for IGF-I it is glycine but it is followed by proline, so the cleavage is not dependable.
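
The methionine removal rule stated above can be written as a small predicate. This is a minimal sketch that encodes only the rule as given in the text; the three N-terminal strings are reduced to the residues the text mentions, and the function name is an illustrative choice.

```python
# Second residues that permit removal of the initiator methionine (per the text).
REMOVABLE_SECOND = set("GASCTPV")


def initiator_met_removed(protein: str) -> bool:
    """Predict removal of the N-terminal Met by methionine aminopeptidase."""
    seq = protein.upper()
    if len(seq) < 2 or seq[0] != "M":
        return False
    if seq[1] not in REMOVABLE_SECOND:
        return False
    # A proline in the third position inhibits cleavage.
    return not (len(seq) > 2 and seq[2] == "P")


n_termini = {
    "Interferon (Met-Cys-Asp)": "MCD",
    "HSA (Met-Asp)": "MD",
    "IGF-I (Met-Gly-Pro)": "MGP",
}
for name, n_term in n_termini.items():
    print(name, initiator_met_removed(n_term))
```

The predicate reproduces the reasoning in the paragraph: only the Interferon construct is expected to lose its methionine reliably.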

For IGF-I expression, the use of the TEV protease (Gibco cat n 10127-017) would be
ideal. The cleavage site that is recognized by this protease is Glu-Asn-Leu-Tyr-Phe-Gln-Gly and
it cuts between Gln and Gly. This strategy allows the release of the mature protein by incubation
with TEV protease, leaving a glycine as the first amino acid, consistent with human mature IGF-I
protein.

The purification system of the E. coli Interferon-α5 expression method was based on 6
Histidine-tags that bind to a nickel column and biotinylated thrombin to eliminate the tag on
IFN-α5. Thrombin recognizes Leu-Val-Pro-Arg-Gly-Ser and cuts between Arg and Gly. This
leaves two extra amino acids in the mature protein, but antiviral activity studies have shown that
this protein is at least as active as commercial IFN-α2.
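
The two site-specific proteases described above differ in what they leave on the released protein, which the following sketch makes explicit. The recognition sequences and cut positions are those stated in the text; the tag and the residues after each site are hypothetical placeholders, and the function is illustrative only.

```python
# Recognition motif and number of motif residues that stay on the N-terminal side.
SITES = {
    "TEV": ("ENLYFQG", 6),      # Glu-Asn-Leu-Tyr-Phe-Gln-Gly, cut between Gln and Gly
    "thrombin": ("LVPRGS", 4),  # Leu-Val-Pro-Arg-Gly-Ser, cut between Arg and Gly
}


def cleave(fusion: str, protease: str) -> tuple[str, str]:
    """Return (removed tag portion, released protein) for the first matching site."""
    motif, offset = SITES[protease]
    at = fusion.upper().find(motif)
    if at < 0:
        raise ValueError(f"no {protease} site found")
    cut = at + offset
    return fusion[:cut], fusion[cut:]


# Hypothetical fusions: His tag + site + first residues of the protein of interest.
print(cleave("MHHHHHH" + "ENLYFQ" + "GPETL", "TEV"))
print(cleave("MHHHHHH" + "LVPRGS" + "CDLP", "thrombin"))
```

In the TEV case the glycine of the site doubles as the first residue of the released protein, whereas thrombin leaves Gly-Ser in front of it, matching the two extra amino acids noted above.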

Example 4

Optimization of gene expression

Foreign genes are expressed between 3% (cry2Aa2) and 47% (cry2Aa2 operon) in
transgenic chloroplasts (3,85). Based on the outcome of the evaluation of HSA chloroplast
transgenic plants, several approaches can be used to enhance translation of the recombinant
proteins. In chloroplasts, transcriptional regulation of gene expression is less important, although
some modulations by light and developmental conditions are observed (123). RNA stability
appears to be among the least of the problems, because excessive accumulation of
foreign transcripts has been observed, at times 16,966-fold higher than in highly expressing nuclear
transgenic plants (124). Chloroplast gene expression is regulated to a large extent at the
post-transcriptional level. For example, 5' UTRs are necessary for optimal translation of chloroplast
mRNAs. Shine-Dalgarno (GGAGG) sequences, as well as a stem-loop structure located 5' adjacent to
the SD sequence, are required for efficient translation. A recent study has shown that insertion of the
psbA 5' UTR downstream of the 16S rRNA promoter enhanced translation of a foreign gene
(GUS) a hundredfold (125a). Therefore, the 200-bp tobacco chloroplast DNA fragment
(1680-1480) containing the 5' psbA UTR should be used. This PCR product is inserted downstream of the
16S rRNA promoter to enhance translation of the recombinant proteins.

Yet another approach for enhancement of translation is to optimize codon compositions.
Since all three proteins are translated in E. coli (see section b), it would be reasonable to
expect efficient expression in chloroplasts. However, optimizing codon compositions to match
the psbA gene could further enhance the level of translation. Although rbcL (RuBisCO) is the
most abundant protein on earth, it is not translated as highly as the psbA gene due to the
extremely high turnover of the psbA gene product. The psbA gene is under stronger selection
for increased translation efficiency and is the most abundant thylakoid protein. In addition, the
codon usage in higher plant chloroplasts is biased towards the NNC codon of 2-fold degenerate
groups (i.e. TTC over TTT, GAC over GAT, CAC over CAT, AAC over AAT, ATC over ATT,
ATA etc.). This is in addition to a strong bias towards T at the third position of 4-fold degenerate
groups. There is also a context effect that should be taken into consideration while modifying
specific codons. The 2-fold degenerate sites immediately upstream from a GNN codon do not
show this bias towards NNC (TTT GGA is preferred to TTC GGA, while TTC CGT is preferred
to TTT CGT, TTC AGT to TTT AGT and TTC TCT to TTT TCT) (125b,126). In addition,
highly expressed chloroplast genes use GNN more frequently than other genes. Codon
composition was optimized by comparing different species. Abundance of amino acids in
chloroplasts and tRNA anticodons present in chloroplasts must be taken into consideration. We
also compared the A+T% content of all foreign genes that had been expressed in transgenic
chloroplasts in our laboratory with the percentage of chloroplast expression. We found that
higher levels of A+T always correlated with high expression levels (see table 1). It is also
possible to modify chloroplast protease recognition sites while modifying codons, without
affecting their biological functions.
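
The NNC preference and its context exception can be sketched as a simple recoding rule. This is an illustration under assumptions, not the optimization procedure actually used: it covers only the four 2-fold degenerate pairs named in the text (Phe, Asp, His, Asn), and it generalizes the single TTT GGA example by choosing the NNT form whenever the next codon is GNN. The isoleucine codons and the 4-fold degenerate groups are deliberately left untouched.

```python
# NNT -> NNC for the 2-fold degenerate pairs listed in the text.
T_TO_C = {"TTT": "TTC", "GAT": "GAC", "CAT": "CAC", "AAT": "AAC"}
C_TO_T = {c: t for t, c in T_TO_C.items()}


def preferred(codon: str, next_codon: str = "") -> str:
    """Pick NNC by default, but NNT when the following codon is GNN (assumed rule)."""
    codon = codon.upper()
    if codon not in T_TO_C and codon not in C_TO_T:
        return codon  # not one of the handled 2-fold degenerate codons
    c_form = T_TO_C.get(codon, codon)
    t_form = C_TO_T.get(codon, codon)
    return t_form if next_codon.upper().startswith("G") else c_form


def recode(cds: str) -> str:
    """Apply the rule codon by codon to an in-frame coding sequence."""
    codons = [cds[i:i + 3] for i in range(0, len(cds) - len(cds) % 3, 3)]
    out = []
    for i, codon in enumerate(codons):
        nxt = codons[i + 1] if i + 1 < len(codons) else ""
        out.append(preferred(codon, nxt))
    return "".join(out)


print(recode("TTTGGATTTCGT"))
```

Because the rule only swaps synonymous codons, the encoded amino acid sequence is unchanged by recoding.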

The sequences of HSA, IGF-I and Interferon-α5 were studied. The HSA
sequence showed 57% A+T content, and 40% of the total codons matched the psbA most
translated codons. According to the data of table 1, we expected good chloroplast expression of
the HSA gene without any modifications in its codon composition. IFN-α5 has 54% A+T
content and 40% matching with psbA codons. The composition seems to be good, but this
protein is small (166 amino acids) and the sequence was optimized to achieve A+T levels close
to 65%. Finally, the analysis of the IGF-I sequence showed that the A+T content was 40% and
only 20% of the codons are the most translated in psbA. Therefore, this gene needed to be
optimized. Optimization of these two genes is done using a novel PCR approach (127,128)
which has been successfully used to optimize codon composition of other human proteins.
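
The two figures quoted for each gene (A+T content and the share of codons matching the most translated psbA codons) can be computed as follows. This is a sketch: the set of preferred psbA codons is not listed in this section, so it is passed in as a parameter, and the sequence and codon set in the example are hypothetical.

```python
def at_content(seq: str) -> float:
    """Percentage of A and T nucleotides in a DNA sequence."""
    seq = seq.upper()
    return 100.0 * (seq.count("A") + seq.count("T")) / len(seq)


def codon_match(cds: str, preferred_codons: set[str]) -> float:
    """Percentage of in-frame codons that belong to the preferred set."""
    cds = cds.upper()
    codons = [cds[i:i + 3] for i in range(0, len(cds) - len(cds) % 3, 3)]
    hits = sum(1 for codon in codons if codon in preferred_codons)
    return 100.0 * hits / len(codons)


# Hypothetical inputs, for illustration only.
example_cds = "ATGTTCGACAAATAA"
example_preferred = {"TTC", "GAC"}
print(round(at_content(example_cds), 1))
print(round(codon_match(example_cds, example_preferred), 1))
```

Running both functions before and after recoding a gene gives a quick check that an optimization moved the sequence toward the targets described above.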

Example 5

Vector constructions

The pLD vector is used for all the constructs. This vector was developed in this laboratory for
chloroplast transformation. It contains the 16S rRNA promoter (Prrn) driving the selectable
marker gene aadA (aminoglycoside adenyl transferase conferring resistance to spectinomycin)
followed by the psbA 3' region (the terminator from a gene coding for photosystem II reaction
center components) from the tobacco chloroplast genome. The pLD vector is a universal
chloroplast expression/integration vector and can be used to transform chloroplast genomes of
several other plant species (73,86) because its flanking sequences are highly conserved among
higher plants. The universal vector uses trnA and trnI genes (chloroplast transfer RNAs coding
for alanine and isoleucine) from the inverted repeat region of the tobacco chloroplast genome as
flanking sequences for homologous recombination. Because the universal vector integrates
foreign genes within the inverted repeat region of the chloroplast genome, it should double the
copy number of the transgene (from 5000 to 10,000 copies per cell in tobacco). Furthermore, it
has been demonstrated that homoplasmy is achieved even in the first round of selection in
tobacco, probably because of the presence of a chloroplast origin of replication within the
flanking sequence in the universal vector (thereby providing more templates for integration).
For these and several other reasons, foreign gene expression was shown to be much
higher when the universal vector was used instead of the tobacco-specific vector (88).

The following vectors are used to optimize protein expression, purification and production of
proteins with the same amino acid composition as in human proteins.

a) In order to optimize expression, translation is increased using the psbA 5' UTR and by
optimizing the codon composition for protein expression in chloroplasts according to criteria
described previously. The 200 bp tobacco chloroplast DNA fragment containing the 5' psbA
UTR is amplified by PCR using tobacco chloroplast DNA as template. This fragment is
cloned directly in the pLD vector multiple cloning site (EcoRI-NcoI) downstream of the
promoter and the aadA gene. The cloned sequence is exactly the same as in the psbA gene.

b) For enhancing protein stability and facilitating purification, the cry2Aa2 Bacillus
thuringiensis operon-derived putative chaperonin is used. Expression of the cry2Aa2 operon
in chloroplasts provides a model system for hyper-expression of foreign proteins (46% of
total soluble protein) in a folded configuration, enhancing their stability and facilitating
purification (3). This justifies inclusion of the putative chaperonin from the cry2Aa2 operon
in one of the newly designed constructs. In this region there are two open reading frames
(ORF1 and ORF2) and a ribosomal binding site (rbs). This sequence contains elements
necessary for Cry2Aa2 crystallization which help to crystallize the HSA, IGF-I and IFN-α5
proteins, aiding in the subsequent purification. Successful crystallization of other proteins
using this putative chaperonin has been demonstrated (94). We amplify the ORF1 and ORF2
of the Bt Cry2Aa2 operon by PCR using the complete operon as template. The fragment is
cloned into a pCR 2.1 vector and excised as an EcoRI-EcoRV product. This fragment is then
cloned directly into the pLD vector multiple cloning site (EcoRI-EcoRV) downstream of the
promoter and the aadA gene.

c) To obtain proteins with the same amino acid composition as mature human proteins, we first
fuse all three genes (codon optimized and native sequence) with the RuBisCo small subunit
transit peptide. Other constructions are also made to allow cleavage of the protein after
isolation from the chloroplast. These strategies also allow affinity purification of the proteins.

The first set of constructs includes the sequence of each protein beginning with an ATG,
introduced by PCR using primers. Processing to get the mature protein may be performed where
the ATG is shown to be a problem (determined by mouse immunological assays). First, we use
the RuBisCo small subunit transit peptide. This transit peptide is amplified by PCR using
tobacco DNA as template and cloned into the pCR 2.1 vector. All genes are fused with the
transit peptide using an MluI restriction site that is introduced in the PCR primers for
amplification of the transit peptide and genes coding for the three proteins. The gene fusions are
inserted into the pLD vectors downstream of the 5' UTR or ORF1+2 using the restriction sites
NcoI and EcoRV, respectively. If use of tags or protease sequences is necessary, such sequences
can be introduced by designing primers including these sequences and amplifying the gene with
these primers. After completing vector constructions, all the vectors are sequenced to confirm correct
nucleotide sequence and in-frame fusion. DNA sequencing is done using a Perkin Elmer ABI
Prism 373 DNA sequencing system.

Because of the similarity of protein synthetic machinery (109), expression of all
chloroplast vectors is first tested in E. coli before their use in tobacco transformation. For
Escherichia coli expression, the XL-1 Blue strain is used. E. coli can be transformed by standard
CaCl2 transformation procedures and grown in TB culture media. Purification, biological and
immunogenic assays are done using E. coli expressed proteins.

Example 6

Bombardment, Regeneration and Characterization of Chloroplast Transgenic Plants 

Tobacco (Nicotiana tabacum var. Petit Havana) plants are grown aseptically by
germination of seeds on MSO medium. This medium contains MS salts (4.3 g/liter), B5 vitamin
mixture (myo-inositol, 100 mg/liter; thiamine-HCl, 10 mg/liter; nicotinic acid, 1 mg/liter;
pyridoxine-HCl, 1 mg/liter), sucrose (30 g/liter) and phytagar (6 g/liter) at pH 5.8. Fully
expanded, dark green leaves of about two-month-old plants are used for bombardment.

Leaves are placed abaxial side up on a Whatman No. 1 filter paper lying on the RMOP
medium (79) in standard petri plates (100x15 mm) for bombardment. Gold (0.6 µm)
microprojectiles are coated with plasmid DNA (chloroplast vectors) and bombardments are
carried out with the biolistic device PDS-1000/He (Bio-Rad) as described by Daniell (110).
Following bombardment, petri plates are sealed with parafilm and incubated at 24°C under a 12 h
photoperiod. Two days after bombardment, leaves are chopped into small pieces of ~5 mm² in
size and placed on the selection medium (RMOP containing 500 µg/ml of spectinomycin
dihydrochloride) with the abaxial side touching the medium in deep (100x25 mm) petri plates (~10
pieces per plate). The regenerated spectinomycin-resistant shoots are chopped into small pieces
(~2 mm²) and subcloned into fresh deep petri plates (~5 pieces per plate) containing the same
selection medium. Resistant shoots from the second culture cycle are then transferred to the
rooting medium (MSO medium supplemented with IBA, 1 mg/liter and spectinomycin
dihydrochloride, 500 mg/liter). Rooted plants are transferred to soil and grown at 26°C under 16
hour photoperiod conditions for further analysis.

PCR analysis of putative transformants 

PCR is done using DNA isolated from control and transgenic plants in order to
distinguish a) true chloroplast transformants from mutants and b) chloroplast transformants from
nuclear transformants. Primers for testing the presence of the aadA gene (that confers
spectinomycin resistance) in transgenic plants land on the aadA coding sequence and the 16S
rRNA gene. In order to test chloroplast integration of the genes, one primer lands on the aadA
gene while another lands on the native chloroplast genome. No PCR product is obtained with
nuclear transgenic plants using this set of primers. The primer set is used to test integration of the
entire gene cassette without any internal deletion or looping out during homologous
recombination. A similar strategy has been used successfully to confirm chloroplast integration of
foreign genes (3,85-88). This screening is essential to eliminate mutants and nuclear
transformants. In order to conduct PCR analyses in transgenic plants, total DNA from
unbombarded and transgenic plants is isolated as described by Edwards et al. (129). Chloroplast
transgenic plants containing the desired gene are then moved to a second round of selection in
order to achieve homoplasmy.

Southern Analysis for homoplasmy and copy number 

Southern blots are done to determine the copy number of the introduced foreign gene per
cell as well as to test homoplasmy. There are several thousand copies of the chloroplast genome
present in each plant cell. Therefore, when foreign genes are inserted into the chloroplast
genome, some of the chloroplast genomes have foreign genes integrated while others remain as
the wild type (heteroplasmy). In order to ensure that only the transformed genome
exists in cells of transgenic plants (homoplasmy), the selection process is continued. In order to
confirm that the wild type genome does not exist at the end of the selection cycle, total DNA
from transgenic plants is probed with the chloroplast border (flanking) sequences (the trnI-trnA
fragment). When wild type genomes are present (heteroplasmy), the native fragment size is
observed along with transformed genomes. Presence of a large fragment (due to insertion of
foreign genes within the flanking sequences) and absence of the native small fragment confirms
homoplasmy (85,86,88).

The copy number of the integrated gene is determined by establishing homoplasmy for
the transgenic chloroplast genome. Tobacco chloroplasts contain 5000-10,000 copies of their
genome per cell (86). If only a fraction of the genomes are actually transformed, the copy
number, by default, must be less than 10,000. By establishing that in the transgenics the
transformed genome carrying the inserted gene is the only one present, one can establish that the copy number is
5000-10,000 per cell. This is usually done by digesting the total DNA with a suitable restriction
enzyme and probing with the flanking sequences that enable homologous recombination into the
chloroplast genome. The native fragment present in the control should be absent in the
transgenics. The absence of the native fragment proves that only the transgenic chloroplast genome
is present in the cell and that no native, untransformed chloroplast genome, without the
foreign gene, is present. This establishes the homoplasmic nature of our transformants,
simultaneously providing us with an estimate of 5000-10,000 copies of the foreign genes per
cell.
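
The copy-number reasoning above can be illustrated with a short calculation. This is a minimal sketch only: the function names are hypothetical, and the default of one foreign-gene copy per chloroplast genome is an assumption made for illustration, not a figure stated in the text. The 5000-10,000 genomes-per-cell range is the one cited above (86).

```python
def foreign_gene_copies_per_cell(genomes_per_cell, transformed_fraction=1.0,
                                 copies_per_genome=1):
    """Estimate foreign-gene copies per cell.

    transformed_fraction is 1.0 at homoplasmy and below 1.0 at heteroplasmy.
    copies_per_genome=1 is an illustrative assumption, not a value from the text.
    """
    if not 0.0 <= transformed_fraction <= 1.0:
        raise ValueError("transformed_fraction must be between 0 and 1")
    return genomes_per_cell * transformed_fraction * copies_per_genome


def copy_number_range(transformed_fraction=1.0, low=5000, high=10000):
    """Return the (low, high) copy-number estimate for the cited genome range."""
    return (foreign_gene_copies_per_cell(low, transformed_fraction),
            foreign_gene_copies_per_cell(high, transformed_fraction))


if __name__ == "__main__":
    print(copy_number_range(1.0))   # homoplasmic plant
    print(copy_number_range(0.5))   # hypothetical heteroplasmic plant
```

Under these assumptions a homoplasmic plant yields the full 5000-10,000 range, while any heteroplasmic plant necessarily falls below 10,000, as stated above.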

Northern Analysis for transcript stability

Northern blots are done to test the efficiency of transcription of the genes. Total RNA is
isolated from 150 mg of frozen leaves by using the "RNeasy Plant Total RNA Isolation Kit"
(Qiagen Inc., Chatsworth, CA). RNA (10-40 µg) is denatured by formaldehyde treatment,
separated on a 1.2% agarose gel in the presence of formaldehyde and transferred to a
nitrocellulose membrane (MSI) as described in Sambrook et al. (130). Probe DNA (proinsulin
gene coding region) is labeled by the random-primed method (Promega) with 32P-dCTP isotope.
The blot is pre-hybridized, hybridized and washed as described above for Southern blot analysis.
Transcript levels are quantified by the Molecular Analyst Program using the GS-700 Imaging
Densitometer (Bio-Rad, Hercules, CA).

Expression and quantification of the total protein expressed in chloroplast 

Chloroplast expression assays are done for each protein by Western blot. Recombinant
protein levels in transgenic plants are determined using quantitative ELISA assays. A standard
curve is generated using known concentrations and serial dilutions of recombinant and native
proteins. Different tissues are analyzed using young, mature and old leaves against these primary
antibodies: goat anti-HSA (Nordic Immunology), anti-IGF-I and anti-Interferon alpha (Sigma).
Bound IgG is measured using horseradish peroxidase-labelled anti-goat IgG.
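
Reading a sample concentration off an ELISA standard curve can be sketched as follows. This is a hedged illustration, not the assay software used here: the function names are hypothetical, piecewise-linear interpolation is an assumed (simple) curve model, and the standard values in the usage lines are invented placeholders rather than measured data.

```python
def interpolate_concentration(standards, absorbance):
    """Interpolate a concentration from (concentration, absorbance) standards.

    Piecewise-linear interpolation is an assumed model; real assays often
    fit a four-parameter logistic curve instead.
    """
    pts = sorted(standards, key=lambda p: p[1])
    if not pts[0][1] <= absorbance <= pts[-1][1]:
        raise ValueError("absorbance lies outside the standard curve")
    for (c0, a0), (c1, a1) in zip(pts, pts[1:]):
        if a0 <= absorbance <= a1:
            if a1 == a0:
                return c0
            return c0 + (c1 - c0) * (absorbance - a0) / (a1 - a0)
    raise ValueError("standard curve needs at least two points")


def percent_of_total_soluble_protein(target, total):
    """Express the target protein as a percentage of total soluble protein.

    Both arguments must be in the same units (for example ng/ml).
    """
    if total <= 0:
        raise ValueError("total soluble protein must be positive")
    return 100.0 * target / total


if __name__ == "__main__":
    # Placeholder standards: (ng/ml, absorbance). Not measured values.
    curve = [(0.0, 0.05), (10.0, 0.25), (20.0, 0.45), (40.0, 0.85)]
    conc = interpolate_concentration(curve, 0.35)
    print(conc, percent_of_total_soluble_protein(conc, 1500.0))
```

Samples whose absorbance falls outside the standards are rejected rather than extrapolated, which mirrors the usual practice of re-assaying at a different dilution.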

Inheritance of Introduced Foreign Genes 

While it is unlikely that introduced DNA would move from the chloroplast genome to the
nuclear genome, it is possible that the gene could be integrated into the nuclear genome during
bombardment and remain undetected in Southern analysis. Therefore, among the initial tobacco
transformants, some are allowed to self-pollinate, whereas others are used in reciprocal crosses
with control tobacco (transgenics as female acceptors and pollen donors; testing for maternal
inheritance). Harvested seeds (T1) are germinated on media containing spectinomycin.
Achievement of homoplasmy and mode of inheritance can be classified by looking at
germination results. Homoplasmy is indicated by totally green seedlings (86), while heteroplasmy
is displayed by variegated leaves (lack of pigmentation, 83). Lack of variation in chlorophyll
pigmentation among progeny also underscores the absence of position effect, an artifact of
nuclear transformation. Maternal inheritance is demonstrated by sole transmission of
introduced genes via seed generated on transgenic plants, regardless of pollen source (green
seedlings on selective media). When transgenic pollen is used for pollination of control plants,
the resultant progeny are not resistant to the selective agent in the media (they appear bleached;
83). Molecular analyses confirm transmission and expression of introduced genes, and T2 seed is
generated from those plants confirmed by the analyses described above.

Example 7
Purification methods

The standard method of purification employs classical biochemical techniques with the
crystallized proteins inside the chloroplast. In this case, the homogenates are passed through
miracloth to remove cell debris. Centrifugation at 10,000 x g pellets all foreign proteins (3).
Proteins are solubilized using pH, temperature gradient, etc. This is possible if ORF1 and ORF2
of the cry2Aa2 operon (see section c) can fold and crystallize the recombinant proteins as
expected. Where there is no crystal formation, other purification methods must be used (classical
biochemistry techniques and affinity columns with protease cleavage).

HSA: Albumin is typically administered in tens of gram quantities. At a purity level of 99.999%
(a level considered sufficient for other recombinant protein preparations), recombinant HSA
(rHSA) impurities on the order of one mg will still be injected into patients. Impurities from
the host organism must therefore be reduced to a minimum. Furthermore, purified rHSA must be identical
to human HSA. Despite these stringent requirements, purification costs must be kept low. To
purify the HSA obtained by gene manipulation, it is not appropriate to apply the conventional
processes for purifying HSA originating in plasma as such. This is because the impurities to be
eliminated from rHSA differ completely from those contained in the HSA originating in plasma.
Namely, rHSA is contaminated with, for example, coloring matter characteristic of recombinant
HSA, proteins originating in the host cells, polysaccharides, etc. In particular, it is necessary to
sufficiently eliminate components originating in the host cells, since they are foreign matter for
living organisms including humans and can cause the problem of antigenicity.
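
The arithmetic behind the purity requirement can be checked with a brief sketch. The function name is hypothetical and the dose values in the usage lines are illustrative points within the "tens of grams" range mentioned above, not clinical figures.

```python
def impurity_mg(dose_g, purity_percent):
    """Mass of impurities (mg) co-administered with a dose of given purity."""
    if not 0.0 <= purity_percent <= 100.0:
        raise ValueError("purity_percent must be between 0 and 100")
    return dose_g * 1000.0 * (100.0 - purity_percent) / 100.0


if __name__ == "__main__":
    for dose in (10.0, 50.0, 100.0):  # illustrative doses in grams
        print(dose, "g ->", round(impurity_mg(dose, 99.999), 3), "mg impurities")
```

At 99.999% purity, doses between 10 g and 100 g carry roughly 0.1 to 1 mg of impurities, consistent with the "order of one mg" estimate in the text.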

In plants, two different methods of HSA purification have been carried out at laboratory scale.
Sijmons et al. (23) transformed potato and tobacco plants with Agrobacterium tumefaciens. For
the extraction and purification of HSA, 1000 g of stem and leaf tissue was homogenized in 1000
ml cold PBS, 0.6% PVP, 0.1 mM PMSF and 1 mM EDTA. The homogenate was clarified and
centrifuged, and the supernatant was incubated for 4 h with 1.5 ml polyclonal anti-HSA
coupled to Reactigel spheres (Pierce Chem) in the presence of 0.5% Tween 80. The complex
HSA-anti-HSA-Reactigel was collected and washed with 5 ml 0.5% Tween 80 in PBS. HSA was
desorbed from the Reactigel complex with 2.5 ml of 0.1 M glycine pH 2.5, 10% dioxane,
immediately followed by a buffer exchange with Sephadex G25 to 50 mM Tris pH 8. The
sample was then loaded on a HR5/5 MonoQ anion exchange column (Pharmacia) and eluted
with a linear NaCl gradient (0-350 mM NaCl in 50 mM Tris pH 8 in 20 min at 1 ml/min).
Fractions containing the concentrated HSA (at 290 mM NaCl) were lyophilized and applied to a
HR 10/30 Sepharose 6 column (Pharmacia) in PBS at 0.3 ml/min. However, this method uses
affinity columns (polyclonal anti-HSA) that are very expensive to scale up. Also, the protein is
released from the column with 0.1 M glycine pH 2.5, which will most probably denature the
protein. Therefore, this method can be suitably modified to reduce these drawbacks.
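
As a worked illustration of the linear NaCl gradient described above, the following sketch estimates when a species eluting at a given salt concentration leaves the column. It assumes an ideal linear gradient with no dead volume or gradient delay, which is a simplification; the function name is hypothetical.

```python
def gradient_elution_point(target_mM, start_mM=0.0, end_mM=350.0,
                           duration_min=20.0, flow_ml_per_min=1.0):
    """Return (time in min, volume in ml) at which the gradient reaches target_mM.

    Assumes an ideal linear gradient and ignores system dead volume.
    """
    low, high = sorted((start_mM, end_mM))
    if not low <= target_mM <= high:
        raise ValueError("target concentration lies outside the gradient")
    fraction = (target_mM - start_mM) / (end_mM - start_mM)
    time_min = fraction * duration_min
    return time_min, time_min * flow_ml_per_min


if __name__ == "__main__":
    t, v = gradient_elution_point(290.0)  # HSA fractions at 290 mM NaCl
    print(round(t, 1), "min;", round(v, 1), "ml")
```

With the stated 0-350 mM gradient over 20 min at 1 ml/min, 290 mM NaCl is reached about 16.6 min (about 16.6 ml) into the gradient under these idealized assumptions.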

The second method is for HSA extraction and purification from potato tubers (Dr.
Mingo-Castel's laboratory). After grinding the tuber in phosphate buffer pH 7.4 (1 mg/2 ml), the
homogenate is filtered through miracloth and centrifuged at 14,000 rpm for 15 minutes. After this step,
another filtration of the supernatant through 0.45 µm filters is necessary. Then, ion-exchange
chromatography by FPLC using a DEAE Sepharose Fast Flow column (Amersham) is required.
Fractions recovered are passed through an affinity column (Blue Sepharose Fast Flow, Amersham),
resulting in a product of high purity. HSA purification based on either method is acceptable.

IGF-I: All earlier attempts to produce IGF-I in E. coli or Saccharomyces cerevisiae have
resulted in misfolded proteins. This has made it necessary to perform additional in vitro refolding
or extensive separation techniques in order to recover the native and biologically active form of the
molecule. In addition, IGF-I has been demonstrated to possess an intrinsic thermodynamic
folding problem with regard to quantitatively folding into a native disulfide-bonded
conformation in vitro (131). Samuelsson et al. (131) and Joly et al. (132) co-expressed IGF-I
with specific proteins of E. coli that significantly improved the relative yields of correctly folded
protein, consequently facilitating purification. Samuelsson et al. (132) fused the protein to
affinity tags based on either the IgG-binding domain (Z) from Staphylococcal protein A or the
two serum albumin domains (ABP) from Streptococcal protein G (134). The fusion protein
concept allows the IGF-I molecules to be purified by IgG or HSA affinity chromatography. We
also use these Z tags for protein purification, including the double Z domain from S. aureus protein
A and a sequence recognized by TEV protease (see section d\2). The fusion protein is incubated
with an IgG column, where binding via the Z domain occurs. The Z domain-IgG interaction is
of high affinity, so contaminant proteins can be easily washed off the column.
Incubation of the column with TEV protease elutes mature IGF-I from the column. TEV protease
is produced in bacteria in large quantities fused to a 6-histidine tag that is used for TEV
purification. This tag can also be used to separate IGF-I from contaminant TEV protease.

IFN-α: In the E. coli expression method used, the purification system was based on
6-histidine tags that bind to a nickel column and on biotinylated thrombin to eliminate the tag on
IFN-α5.

Example 8

Characterization of the recombinant proteins

For the safe use of recombinant proteins as a replacement in any of the current
applications, these proteins must be structurally equivalent and must not contain abnormal
host-derived modifications. To confirm compliance with these criteria, we compare human and
recombinant proteins using the highly sensitive and highly resolving techniques currently
expected by the regulatory authorities to characterize recombinant products (135).

Amino acid analysis 

Amino acid analysis to confirm the correct sequence is performed following off-line vapour
phase hydrolysis using an ABI 420A amino acid derivatizer with an on-line 130A
phenylthiocarbamyl-amino acid analyzer (Applied Biosystems/ABI). N-terminal sequence
analysis is performed by Edman degradation using an ABI 477A protein sequencer with an on-line
120A phenylthiohydantoin-amino acid analyzer. Automated C-terminal sequence analysis uses a
Hewlett-Packard G1009A protein sequencer. To confirm the C-terminal sequence to a greater
number of residues, the C-terminal tryptic peptide is isolated from tryptic digests by
reverse-phase HPLC.

Protein folding and disulfide bridge formation

Western blots with reducing and non-reducing gels are done to check protein folding. PAGE
to visualize small proteins is done in the presence of tricine. Protein standards (Sigma) are
loaded to compare the mobility of the recombinant proteins. PAGE is performed on PhastGels
(Pharmacia Biotech). Proteins are blotted and then probed with goat anti-HSA, interferon alpha
and IGF-I polyclonal antibodies. Bound IgG is detected with horseradish peroxidase-labelled
anti-goat IgG and visualized on X-ray film using ECL detection reagents (Amersham).



Tryptic mapping

To confirm the presence of chloroplast expressed proteins with disulfide linkages
identical to native human proteins, the samples are subjected to tryptic digestion followed by
peptide mass mapping using matrix-assisted laser desorption ionization mass spectrometry
(MALDI-MS). Samples are reduced with dithiothreitol, alkylated with iodoacetamide and then
digested with trypsin, comprising three additions of 1:100 enzyme/substrate over 48 h at 37°C.
Subsequently, tryptic peptides are separated by reverse-phase HPLC on a Vydac C18 column.

Mass analysis 

Electrospray mass spectrometry (ESMS) is performed using a VG Quattro electrospray
mass spectrometer. Samples are desalted prior to analysis by reverse-phase HPLC using an
acetonitrile gradient containing trifluoroacetic acid.

CD 

Spectra are measured in a nitrogen atmosphere using a Jasco J600 spectropolarimeter. 

Chromatographic techniques 

For HSA, analytical gel-permeation HPLC is performed using a TSK G3000 SWxl
column. Preparative gel permeation chromatography of HSA is performed using a Sephacryl
S200 HR column. The monomer fraction, identified by absorbance at 280 nm, is dialyzed and
reconcentrated to its starting concentration. For IGF-I, reversed-phase chromatography is performed
on the SMART system (Pharmacia Biotech) with the µRPC C2/C18 SC 2.1/10 column.

Viscosity 

This is a classical assay for recombinant HSA. Viscosity is a characteristic of proteins
related directly to their size, shape, and conformation. The viscosities of HSA and recombinant
HSA can be measured at 100 mg/ml in 0.15 M NaCl using a U-tube viscosimeter (M2 type,
Poulton, Selfe and Lee Ltd, Essex, UK) at 25°C.

Glycosylation

Chloroplast proteins are not known to be glycosylated. However, there are no publications
to confirm or refute this assumption. Therefore, glycosylation should be measured using a
scaled-up version of the method of Ahmed and Furth (136).



Example 9
Biological Assays

Because HSA does not have enzymatic activity, it is not possible to run biological assays.
However, three different techniques can be used to check IGF-I functionality. All of them are
based on the proliferation of IGF-I responding cells. First, radioactive thymidine uptake can be
measured in 3T3 fibroblasts, which express the IGF-I receptor, as an estimate of DNA synthesis. Also,
a human megakaryoblastic cell line, HU-3, can be used. As HU-3 grows in suspension, changes
in cell number and stimulation of glucose uptake induced by IGF-I are assayed using
AlamarBlue or glucose consumption, respectively. AlamarBlue (Accumed International,
Westlake, OH) is reduced by mitochondrial enzyme activity. The reduced form of the reagent is
fluorescent and can be quantitatively detected, with an excitation of 530 nm and an emission of
590 nm. AlamarBlue is added to the cells for 24 hours after 2 days of induction with different doses
of IGF-I and in the absence of serum. Glucose consumption by HU-3 cells is then measured
using a colorimetric glucose oxidase procedure provided by Sigma. HU-3 cells are incubated in
the absence of serum with different doses of IGF-I. Glucose is added for 8 hours and glucose
concentration is then measured in the supernatant. All three methods to measure IGF-I
functionality are precise, accurate and dose dependent, with a linear range between 0.5 and 50
ng/ml (137).

The method to determine IFN activity is based on its anti-viral properties. This
procedure measures the ability of IFN to protect HeLa cells against the cytopathic effect of
encephalomyocarditis virus (EMC). The assay is performed in a 96-well microtitre plate. First,
HeLa cells are seeded in the wells and allowed to grow to confluency. Then, the medium is
removed, replaced with medium containing IFN dilutions, and incubated for 24 hours. EMC
virus is added and 24 hours later the cytopathic effect is measured. For that, the medium is
removed and wells are rinsed two times with PBS and stained with methyl violet dye solution.
The optical density is read at 540 nm. The values of optical density are proportional to the
antiviral activity of IFN (138). Specific activity is determined with reference to standard IFN-α
(code 82/576) obtained from NIBSC.

Example 10 

Animal testing and Pre-Clinical Trials 

Once albumin is produced at adequate levels in tobacco and the physicochemical
properties of the product correspond to those of the natural protein, toxicology studies need to be
done in mice. To avoid a murine immune response to the human protein, transgenic mice carrying HSA
genomic sequences are used (139). After injection of 0, 1, 10, 50 and 100 mg of purified
recombinant protein, classical toxicology studies are carried out (body weight and food intake,
animal behavior, piloerection, etc). Albumin can be tested for blood volume replacement after
removal of the fluid from the peritoneal cavity in patients with liver cirrhosis. It has
been shown that albumin infusion after this maneuver is essential to preserve effective
circulatory volume and renal function (140).

IGF-I and IFN-α are tested for biological effects in vivo in animal models. Specifically,
woodchucks (Marmota monax) infected with the woodchuck hepatitis virus (WHV) are widely
considered the best animal model of hepatitis B virus infection (141). Preliminary studies have
shown a significant increase in 5' oligoadenylate synthase RNA levels by real time polymerase
chain reaction (PCR) in woodchuck peripheral blood mononuclear cells upon incubation with
human IFN-α5, a proof of the biological activity of human IFN-α5 in woodchuck cells. For in
vivo studies, a total of 7 woodchucks chronically infected with WHV (WHV surface antigen and
WHV-DNA positive in serum) are used: 5 animals are injected subcutaneously with 500,000
units of human IFN-α5 (the activity of human IFN-α5 is determined as described previously)
three times a week for 4 months; the remaining two woodchucks are injected with placebo and
used as controls. Follow-up includes weekly serological (WHV surface antigen and anti-WHV
surface antibodies by ELISA) and virological (WHV DNA in serum by real time quantitative
PCR) as well as monthly immunological (T-helper responses against WHV surface and WHV
core antigens measured by interleukin 2 production from PBMC incubated with those proteins)
studies. Finally, basal and end-of-treatment liver biopsies should be performed to score liver
inflammation and intrahepatic WHV-DNA levels. The final goal of treatment is a decrease in viral
replication, measured by WHV-DNA in serum, with secondary end points being histological improvement
and a decrease in intrahepatic WHV-DNA levels.

For IGF-I, the in vivo therapeutic efficacy is tested in animals in situations of IGF-I
deficiency such as liver cirrhosis in rats. Several reports (56-58) have been published showing
that recombinant human IGF-I has marked beneficial effects in increasing bone and muscle
mass, improving liver function and correcting hypogonadism. Briefly, the induction protocol is
as follows: Liver cirrhosis is induced in rats by inhalation of carbon tetrachloride twice a week
for 11 weeks, with a progressively increasing exposure time from 1 to 5 minutes per gassing
session. After the 11th week, animals continue receiving CCl4 once a week (3 minutes per
inhalation) to complete 30 weeks of CCl4 administration. During the whole induction period,
phenobarbital (400 mg/L) is added to drinking water. To test the therapeutic efficacy of
tobacco-derived IGF-I, cirrhotic rats receive 2 µg/100 g body weight/day of this compound in two
divided doses, during the last 21 days of the induction protocol (weeks 28, 29, and 30). On day
22, animals are sacrificed and liver and blood samples collected. The results are compared to
those obtained in cirrhotic animals receiving placebo instead of tobacco-derived IGF-I, and to
healthy control rats.
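
The IGF-I dosing regimen above (2 µg per 100 g body weight per day, in two divided doses, for 21 days) translates into per-animal amounts as sketched below. The function name and the 250 g body weight in the usage line are hypothetical and chosen only for illustration.

```python
def igf1_schedule(body_weight_g, dose_ug_per_100g=2.0, doses_per_day=2, days=21):
    """Compute per-dose, daily and total IGF-I amounts (µg) for one animal."""
    if body_weight_g <= 0 or doses_per_day <= 0 or days <= 0:
        raise ValueError("body weight, doses per day and days must be positive")
    daily_ug = body_weight_g / 100.0 * dose_ug_per_100g
    return {
        "per_dose_ug": daily_ug / doses_per_day,
        "daily_ug": daily_ug,
        "total_ug": daily_ug * days,
    }


if __name__ == "__main__":
    # 250 g is an illustrative body weight, not a value from the protocol.
    print(igf1_schedule(250.0))
```

Because the dose scales with body weight, the amounts would need to be recomputed if animals are re-weighed during the 21-day treatment window.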
PRODUCTION OF HUMAN INSULIN IN TRANSGENIC TOBACCO

FIELD OF THE INVENTION 
This invention relates to production of high value pharmaceutical proteins in nuclear 
transgenic plants, particularly to production of human insulin in transgenic tobacco. 

BACKGROUND 

Research on human proteins in the past years has revolutionized the use of these
therapeutically valuable proteins in a variety of clinical situations. Since the demand for these
proteins is expected to increase considerably in the coming years, it would be wise to ensure that in
the future they will be available in significantly larger amounts, preferably on a cost-effective basis.
Because most genes can be expressed in many different systems, it is essential to determine which
system offers the most advantages for the manufacture of the recombinant protein. An ideal
expression system would be one that produces a maximum amount of safe, biologically active
material at a minimum cost. The use of modified mammalian cells with recombinant DNA
techniques has the advantage of resulting in products that are closely related to those of natural
origin. However, culturing these cells is intricate and can only be carried out on a limited scale.

The use of microorganisms such as bacteria permits manufacture on a larger scale, but
introduces the disadvantage of producing products that differ appreciably from the products of
natural origin. For example, proteins that are usually glycosylated in humans are not glycosylated
by bacteria. Furthermore, human proteins that are expressed at high levels in E. coli frequently
acquire an unnatural conformation, accompanied by intracellular precipitation due to lack of proper
folding and disulfide bridges. Production of recombinant proteins in plants has many potential
advantages for generating biopharmaceuticals relevant to clinical medicine. These include the
following: (i) plant systems are more economical than industrial facilities using fermentation
systems; (ii) technology is available for harvesting and processing plants/plant products on a large
scale; (iii) elimination of the purification requirement when the plant tissue containing the
recombinant protein is used as a food (edible vaccines); (iv) plants can be directed to target proteins
into stable, intracellular compartments such as chloroplasts, or to express them directly in chloroplasts; (v) the
amount of recombinant product that can be produced approaches industrial-scale levels; and (vi)
health risks due to contamination with potential human pathogens/toxins are minimized.

It has been estimated that one tobacco plant should be able to produce more recombinant
protein than a 300-liter fermenter of E. coli (Crop Tech, VA). In addition, a tobacco plant can
produce a million seeds, facilitating large-scale production. Tobacco is also an ideal choice because
of its relative ease of genetic manipulation and an impending need to explore alternate uses for this
hazardous crop. However, with the exception of enzymes (e.g. phytase), levels of foreign proteins
produced in nuclear transgenic plants are generally low, mostly less than 1% of the total soluble
protein (Kusnadi et al. 1997). May et al. (1996) discuss this problem using the following examples.
Although plant derived recombinant hepatitis B surface antigen was as effective as a commercial
recombinant vaccine, the levels of expression in transgenic tobacco were low (0.0066% of total
soluble protein). Even though Norwalk virus capsid protein expressed in potatoes caused oral
immunization when consumed as food (edible vaccine), expression levels were low (0.3% of total
soluble protein).

In particular, expression of human proteins in nuclear transgenic plants has been
disappointingly low: e.g. human Interferon-β 0.000017% of fresh weight, human serum albumin
0.02% and erythropoietin 0.0026% of total soluble protein (see Table 1 in Kusnadi et al. 1997). A
synthetic gene coding for the human epidermal growth factor was expressed only up to 0.001% of
total soluble protein in transgenic tobacco (May et al. 1996). The cost of producing recombinant
proteins in alfalfa leaves was estimated to be 12-fold lower than in potato tubers and comparable
with seeds (Kusnadi et al. 1997). However, tobacco leaves are much larger and have much higher
biomass than alfalfa. Planet Biotechnology has recently estimated that at 50 mg/liter of mammalian
cell culture or transgenic goat's milk or 50 mg/kg of tobacco leaf expression, the cost of purified IgA
will be $10,000, $1,000 and $50 per gram, respectively (Daniell et al. 2000). The cost of production of
recombinant proteins will be 50-fold lower than that of E. coli fermentation (with 20% expression
levels in E. coli) (Kusnadi et al. 1997). A decrease in insulin expression from 20% to 5% of biomass
doubled the cost of production in E. coli (Petridis et al. 1995). An expression level of less than 1% of total
soluble protein in plants has been found not to be commercially feasible (Kusnadi et al. 1997).
Therefore, it is important to increase levels of expression of recombinant proteins in plants to exploit
plant production of pharmacologically important proteins.

An alternate approach is to express foreign proteins in chloroplasts of higher plants. We have
recently integrated foreign genes (up to 10,000 copies per cell) into the tobacco chloroplast genome,
resulting in accumulation of recombinant proteins up to 46% of the total soluble protein (De Cosa et
al. 2001). Chloroplast transformation utilizes two flanking sequences that, through homologous
recombination, insert foreign DNA into the spacer region between the functional genes of the
chloroplast genome, thereby targeting the foreign genes to a precise location. This eliminates the
"position effect" and gene silencing frequently observed in nuclear transgenic plants. Chloroplast
genetic engineering is an environmentally friendly approach, minimizing concerns of out-cross of
introduced traits via pollen to weeds or other crops (Bock and Hagemann 2000, Heifetz 2000). Also,
the concerns of insects developing resistance to biopesticides are minimized by hyper-expression of
single insecticidal proteins (high dosage) or expression of different types of insecticides in a single
transformation event (gene pyramiding). Concerns of insecticidal proteins on non-target insects are
minimized by lack of expression in transgenic pollen (De Cosa et al. 2001).

Importantly, a significant advantage in the production of pharmaceutical proteins in
chloroplasts is their ability to process eukaryotic proteins, including folding and formation of
disulfide bridges (Drescher et al. 1998). Chaperonin proteins are present in chloroplasts (Roy, 1989;
Vierling, 1991) that function in folding and assembly of prokaryotic/eukaryotic proteins. Also,
proteins are activated by disulfide bond oxido/reduction cycles using the chloroplast thioredoxin
system (Ruelland and Miginiac-Maslow, 1999) or chloroplast protein disulfide isomerase (Kim and
Mayfield, 1997). Accumulation of the fully assembled, disulfide-bonded form of human somatotropin
via chloroplast transformation (Staub et al. 2000), the oligomeric form of CTB (Henriques and Daniell,
2000) and the assembly of heavy/light chains of humanized Guy's 13 antibody in transgenic
chloroplasts (Panchal et al. 2000) provide strong evidence for successful processing of
pharmaceutical proteins inside chloroplasts. Such folding and assembly should eliminate the need
for highly expensive in vitro processing of pharmaceutical proteins. For example, 60% of the total
operating cost in the production of human insulin is associated with in vitro processing (formation of
disulfide bridges and cleavage of methionine, Petridis et al. 1995).

Another major cost of insulin production is purification. Chromatography accounts for 30%
of operating expenses and 70% of equipment in production of insulin (Petridis et al. 1995).
Therefore, new approaches are needed to minimize or eliminate chromatography in insulin
production. One such approach is the use of GVGVP as a fusion protein to facilitate single step
purification without the use of chromatography. GVGVP is a Protein Based Polymer (PBP) made
from synthetic genes. At lower temperatures this polymer exists as more extended molecules. Upon
raising the temperature above the transition range, the polymer hydrophobically folds into dynamic
structures called β-spirals that further aggregate by hydrophobic association to form twisted
filaments (Urry, 1991; Urry et al., 1994). Inverse temperature transition offers several advantages.
It facilitates scale up of purification from grams to kilograms. The milder purification conditions require
only a modest change in temperature and ionic strength. This should also facilitate higher recovery,
faster purification and high volume processing. Protein purification is generally the slow step
(bottleneck) in pharmaceutical product development. Through exploitation of this reversible inverse
temperature transition property, simple and inexpensive extraction and purification may be
performed. The temperature at which the aggregation takes place can be manipulated by
engineering biopolymers containing varying numbers of repeats and changing salt concentration in
solution (McPherson et al., 1996). Chloroplast mediated expression of insulin-polymer fusion
protein should eliminate the need for the expensive fermentation process as well as reagents needed
for recombinant protein purification and downstream processing.
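
The cost shares cited above from Petridis et al. (1995) can be combined in a rough back-of-the-envelope
calculation. The sketch below is illustrative only: it assumes that the 60% in vitro processing share and
the 30% chromatography share refer to the same operating cost base, do not overlap, and could be removed
completely. None of these assumptions is stated in the cited work, and the function name is hypothetical.

```python
# Cost shares quoted in the text from Petridis et al. (1995), in percent of operating cost.
IN_VITRO_PROCESSING_PCT = 60  # disulfide bridge formation and methionine cleavage
CHROMATOGRAPHY_PCT = 30       # chromatography share of operating expenses


def remaining_operating_cost_pct(eliminated_shares):
    """Return the percent of operating cost left after removing the given shares.

    Assumption (not from the source): the shares are non-overlapping parts of the
    same cost base and each one is eliminated completely.
    """
    total = sum(eliminated_shares)
    if not 0 <= total <= 100:
        raise ValueError("eliminated shares must sum to between 0 and 100 percent")
    return 100 - total


if __name__ == "__main__":
    # Folding in the chloroplast instead of in vitro processing.
    print(remaining_operating_cost_pct([IN_VITRO_PROCESSING_PCT]))
    # Additionally replacing chromatography with polymer-based purification.
    print(remaining_operating_cost_pct([IN_VITRO_PROCESSING_PCT, CHROMATOGRAPHY_PCT]))
```

Under these assumptions, the two steps together would account for most of the operating cost; the real
saving depends on how much the shares overlap and on the cost of the replacement steps.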

Oral delivery of insulin is yet another powerful approach that can eliminate up to 97% of the
production cost of insulin (Petridis et al. 1995). For example, Sun et al. (1994) have shown that
feeding a small dose of antigens conjugated to the receptor binding non-toxic B subunit moiety of
the cholera toxin (CTB) suppressed systemic T cell-mediated inflammatory reactions in animals.
Oral administration of a myelin antigen conjugated to CTB has been shown to protect animals
against encephalomyelitis, even when given after disease induction (Sun et al. 1996). Bergerot et al.
(1997) reported that feeding small amounts of human insulin conjugated to CTB suppressed beta cell
destruction and clinical diabetes in adult non-obese diabetic (NOD) mice. The protective effect
could be transferred by T cells from CTB-insulin treated animals and was associated with reduced
insulitis. These results demonstrate that protection against autoimmune diabetes can indeed be
achieved by feeding small amounts of a pancreas islet cell autoantigen linked to CTB (Bergerot et
al. 1997). Conjugation with CTB facilitates antigen delivery and presentation to the Gut Associated
Lymphoid Tissues (GALT) due to its affinity for the cell surface receptor GM1-ganglioside located
on GALT cells, for increased uptake and immunologic recognition (Arakawa et al. 1998).
Transgenic potato tubers expressed CTB-insulin fusion protein at up to 0.1% of total soluble protein,
and the fusion retained GM1-ganglioside binding affinity and native antigenicity for both CTB and insulin.
NOD mice fed with transgenic potato tubers containing microgram quantities of CTB-insulin fusion
protein showed a substantial reduction in insulitis and a delay in the progression of diabetes (Arakawa
et al. 1998). However, for commercial exploitation, the levels of expression should be increased in
transgenic plants. Therefore, we propose here expression of CTB-insulin fusion in transgenic
chloroplasts of nicotine free edible tobacco to increase levels of expression adequate for animal
testing.

Taken together, low levels of expression of human proteins in nuclear transgenic plants, and
difficulty in folding, assembly/processing of human proteins in E.coli should make chloroplasts an
alternate compartment for expression of these proteins. Production of human proteins in transgenic
chloroplasts should also dramatically lower the production cost. Large-scale production of insulin in
tobacco in conjunction with an oral delivery system can be a powerful approach to provide treatment
to diabetes patients at an affordable cost and provide tobacco farmers alternate uses for this
hazardous crop. Therefore, it is first advantageous to use poly(GVGVP) as a fusion protein to enable
hyper-expression of insulin and accomplish rapid one step purification of the fusion peptide utilizing
the inverse temperature transition properties of this polymer. It is further advantageous to develop
insulin-CTB fusion protein for oral delivery in nicotine free edible tobacco (LAMD 605).
Both achievements can be accomplished as follows:

a) Develop recombinant DNA vectors for enhanced expression of Proinsulin as fusion proteins
with GVGVP or CTB via chloroplast genomes of tobacco;

b) Obtain transgenic tobacco (Petit Havana & LAMD 605) plants;

c) Characterize transgenic expression of proinsulin polymer or CTB fusion proteins using
molecular and biochemical methods in chloroplasts;

d) Employ existing or modified methods of polymer purification from transgenic leaves;

e) Analyze Mendelian or maternal inheritance of transgenic plants;

f) Large scale purification of insulin and comparison of current insulin purification methods
with polymer-based purification method in E.coli and tobacco;

g) Compare natural refolding in chloroplasts with in vitro processing;

h) Characterization (yield and purity) of proinsulin produced in E.coli and transgenic tobacco;
and

i) Assessment of diabetic symptoms in mice fed with edible tobacco expressing CTB-insulin
fusion protein.

Diabetes and Insulin: The most obvious action of insulin is to lower blood glucose (Oakly et al.
1973). This is a result of its immediate effect in increasing glucose uptake in tissues. In muscle,
under the action of insulin, glucose is more readily taken up and either converted to glycogen and
lactic acid or oxidized to carbon dioxide. Insulin also affects a number of important enzymes
concerned with cellular metabolism. It increases the activity of glucokinase, which phosphorylates
glucose, thereby increasing the rate of glucose metabolism in the liver. Insulin also suppresses
gluconeogenesis by depressing the function of liver enzymes, which operate the reverse pathway
from proteins to glucose. Lack of insulin can restrict the transport of glucose into muscle and
adipose tissue. This results in increases in blood glucose levels (hyperglycemia). In addition, the
breakdown of natural fat to free fatty acids and glycerol is increased and there is a rise in the fatty
acid content in the blood. Increased catabolism of fatty acids by the liver results in greater
production of ketone bodies. They diffuse from the liver and pass to the muscles for further
oxidation. Soon, ketone body production rate exceeds oxidation rate and ketosis results. Fewer
amino acids are taken up by the tissues and protein degradation results. At the same time,
gluconeogenesis is stimulated and protein is used to produce glucose. Obviously, lack of insulin has
serious consequences.

Diabetes is classified into types I and II. Type I is also known as insulin dependent diabetes
mellitus (IDDM). Usually this is caused by a cell-mediated autoimmune destruction of the
pancreatic β-cells (Davidson, 1998). Those suffering from this type are dependent on external
sources of insulin. Type II is known as noninsulin-dependent diabetes mellitus (NIDDM). This
usually involves resistance to insulin in combination with its underproduction. These prominent
diseases have led to extensive research into microbial production of recombinant human insulin
(rHI).

Expression of Recombinant Human Insulin in E.coli: In 1978, two thousand kilograms of insulin
were used in the world each year; half of this was used in the United States (Steiner et al., 1978). At
that time, the number of diabetics in the US was increasing 6% every year (Gunby, 1978). In 1997-98,
a 10% increase in sales of diabetes care products and a 19% increase in insulin products were
reported by Novo Nordisk, making it a 7.8 billion dollar industry. Annually, 160,000 Americans are
killed by diabetes, making it the fourth leading cause of death. Many methods of production of rHI
have been developed. Insulin genes were first chemically synthesized for expression in Escherichia
coli (Crea et al., 1978). These genes encoded separate insulin A and B chains. The genes were each
expressed in E.coli as fusion proteins with β-galactosidase (Goeddel et al., 1979). The first
documented production of rHI using this system was reported by David Goeddel from Genentech
(Hall, 1988). The genes were fused to the Trp synthase gene, which resulted in increased insulin
yield, due to the smaller fusion peptide. This fusion protein was approved for commercial
production by Eli Lilly in 1982 (Chance and Frank, 1993) with a product name of Humulin. As of
1986, Humulin was produced from proinsulin genes. Proinsulin contains both insulin chains and the
C-peptide that connects them. Normal in vitro post-translational processing of proinsulin includes
use of trypsin and carboxypeptidase B for maturation to insulin. Other data concerning commercial
production of Humulin and other insulin products is now considered proprietary information and is
not available to the public.

Protein Based Polymers (PBP): The synthetic gene that codes for a bioelastic PBP was designed
after repeated amino acid sequences GVGVP, observed in all sequenced mammalian elastin proteins
(Yeh et al. 1987). Elastin is one of the strongest known natural fibers and is present in skin,
ligaments, and arterial walls. Bioelastic PBPs containing multiple repeats of this pentamer have
remarkable elastic properties, enabling several medical and non-medical applications (Urry et al.
1993, Urry 1995, Daniell 1995). GVGVP polymers prevent adhesions following surgery, aid in
reconstructing tissues and deliver drugs to the body over an extended period of time. North
American Science Associates, Inc. reported that GVGVP polymer is non-toxic in mice,
non-sensitizing and non-antigenic in guinea pigs, and non-pyrogenic in rabbits (Urry et al. 1993).
Researchers have also observed that inserting sheets of GVGVP at the sites of contaminated wounds
in rats reduces the number of adhesions that form as the wounds heal (Urry et al. 1993). In a similar
manner, using the GVGVP to encase muscles that are cut during eye surgery in rabbits prevents
scarring following the operation (Urry et al. 1993, Urry 1995). Other medical applications of
bioelastic PBPs include tissue reconstruction (synthetic ligaments and arteries, bones), wound
coverings, artificial pericardia, catheters and programmed drug delivery (Urry, 1995; Urry et al.,
1993, 1996).

We have expressed the elastic PBP (GVGVP)121 in E.coli (Guda et al. 1995, Brixey et al.
1997), in the fungus Aspergillus nidulans (Herzog et al. 1997), in cultured tobacco cells (Zhang et al.
1995), and in transgenic tobacco plants (Zhang et al. 1996). In particular, (GVGVP)121 has been
expressed to such high levels in E.coli that polymer inclusion bodies occupied up to about 90% of
the cell volume. Also, inclusion bodies have been observed in chloroplasts of transgenic tobacco
plants (see Daniell and Guda, 1997). Recently, we reported stable transformation of the tobacco
chloroplasts by integration and expression of the biopolymer gene (EG121) into the Large Single Copy
region (5,000 copies per cell) or the Inverted Repeat region (10,000 copies per cell) of the
chloroplast genome (Guda et al., 2000).
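
The two copy numbers quoted above differ by a factor of two because the Inverted Repeat is present twice in
each chloroplast genome, whereas the Large Single Copy region is present once. A minimal sketch of this
arithmetic follows. The figure of 5,000 chloroplast genomes per cell is an assumption chosen to match the
Large Single Copy number quoted in the text, and the function and constant names are hypothetical.

```python
# Number of times each region occurs in one chloroplast genome.
# The inverted repeat (IR) occurs twice; the large single copy (LSC) region occurs once.
REGION_COPIES_PER_GENOME = {"LSC": 1, "IR": 2}

# Assumption: chloroplast genomes per leaf cell, chosen to match the LSC figure in the text.
GENOMES_PER_CELL = 5_000


def transgene_copies_per_cell(genomes_per_cell: int, region: str) -> int:
    """Return transgene copies per cell for an insertion into the given region."""
    if region not in REGION_COPIES_PER_GENOME:
        raise ValueError(f"unknown region: {region!r}")
    return genomes_per_cell * REGION_COPIES_PER_GENOME[region]


if __name__ == "__main__":
    for region in ("LSC", "IR"):
        print(region, transgene_copies_per_cell(GENOMES_PER_CELL, region))
```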

PBP as Fusion Proteins: Several systems are now available to simplify protein purification
including the maltose binding protein (Maina et al. 1988), glutathione S-transferase (Smith and
Johnson, 1988), biotinylated (Tsao et al. 1996), thioredoxin (Smith et al. 1998) and cellulose binding
(Ong et al. 1989) proteins. Recombinant DNA vectors for fusion with short peptides are available to
effectively utilize the aforementioned fusion proteins in the purification process (Smith et al. 1988; Kim
and Raines, 1993; Su et al. 1992). Recombinant proteins are generally purified by affinity
chromatography, using ligands specific to carrier proteins (Nilsson et al. 1997). While these are
useful techniques for laboratory scale purification, affinity chromatography for large-scale
purification is time consuming and cost prohibitive. Therefore, economical and
non-chromatographic techniques are highly desirable. In addition, a common solution to N-terminal
degradation of small peptides is to fuse foreign peptides to endogenous E.coli proteins. Early in the
development of this technique, β-galactosidase (β-gal) was used as a fusion protein (Goldberg and
Goff, 1986). A drawback of this method was that the β-gal protein is of relatively high molecular
weight (MW 100,000). Therefore, the proportion of the peptide product in the total protein is low.
Another problem associated with the large β-gal fusion is early termination of translation (Burnett,
1983; Hall, 1988). This occurred when β-gal was used to produce human insulin peptides because
the fusion was detached from the ribosome during translation, thus yielding incomplete peptides.
Other proteins of lower molecular weight have been used as fusion proteins to increase
peptide production. For example, better yields were obtained with the tryptophan synthase (190aa)
fusion proteins (Hall, 1988, Burnett, 1983).
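
The effect of carrier size on the proportion of peptide product can be illustrated with a short calculation.
The sketch below is not taken from the cited work. It assumes an average residue mass of about 110 Da to
convert the β-gal molecular weight quoted above into an approximate length, uses the known lengths of the
human insulin A chain (21 residues) and B chain (30 residues), and approximates the mass fraction of a
fusion by its residue fraction. All function and constant names are hypothetical.

```python
# Values taken from the text above.
BETA_GAL_MW_DA = 100_000        # approximate molecular weight of the beta-gal carrier
TRP_SYNTHASE_FUSION_AA = 190    # length of the tryptophan synthase carrier

# Known lengths of the human insulin chains.
INSULIN_A_CHAIN_AA = 21
INSULIN_B_CHAIN_AA = 30

# Assumption: average mass of one amino acid residue, used only for a rough conversion.
AVG_RESIDUE_MASS_DA = 110


def approx_length_aa(molecular_weight_da: float) -> int:
    """Estimate the number of residues from a molecular weight (rough approximation)."""
    return round(molecular_weight_da / AVG_RESIDUE_MASS_DA)


def peptide_fraction(peptide_aa: int, carrier_aa: int) -> float:
    """Fraction of the fusion protein, by residue count, that is the desired peptide."""
    return peptide_aa / (peptide_aa + carrier_aa)


if __name__ == "__main__":
    beta_gal_aa = approx_length_aa(BETA_GAL_MW_DA)
    for name, length in (("A chain", INSULIN_A_CHAIN_AA), ("B chain", INSULIN_B_CHAIN_AA)):
        with_beta_gal = peptide_fraction(length, beta_gal_aa)
        with_trp = peptide_fraction(length, TRP_SYNTHASE_FUSION_AA)
        print(f"{name}: {with_beta_gal:.1%} with beta-gal, {with_trp:.1%} with Trp synthase")
```

With these assumptions the insulin chain makes up only a few percent of a β-gal fusion but roughly a tenth
or more of a tryptophan synthase fusion, which is consistent with the better yields described above.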

One primary advantage of this invention is to use poly(GVGVP) as a fusion protein to enable
hyper-expression of insulin and accomplish rapid one step purification of the fusion peptide. At
lower temperatures the polymers exist as more extended molecules which, on raising the temperature
above the transition range, hydrophobically fold into dynamic structures called β-spirals that further
aggregate by hydrophobic association to form twisted filaments (Urry, 1991). Through exploitation
of this reversible property, simple and inexpensive extraction and purification is performed. The
temperature at which aggregation takes place (Tt) can be manipulated by engineering biopolymers
containing varying numbers of repeats or changing salt concentration (McPherson et al., 1996).
Another group has recently demonstrated purification of recombinant proteins by fusion with
thermally responsive polypeptides (Meyer and Chilkoti, 1999). Polymers of different sizes have
been synthesized and expressed in E.coli. This approach can also eliminate the need for expensive
reagents, equipment and time required for purification.

Cholera Toxin B subunit as a fusion protein: Vibrio cholerae causes diarrhea by colonizing the
small intestine and producing enterotoxins, of which the cholera toxin (CT) is considered the main
cause of toxicity. CT is a hexameric AB5 protein having one 27 kDa A subunit, which has toxic
ADP-ribosyl transferase activity, and a non-toxic pentamer of 11.6 kDa B subunits that are
non-covalently linked into a very stable doughnut like structure into which the toxic active (A) subunit is
inserted. The A subunit of CT consists of two fragments, A1 and A2, which are linked by a
disulfide bond. The enzymatic activity of CT is located solely on the A1 fragment (Gill, 1976). The
A2 fragment of the A subunit links the A1 fragment and the B pentamer. CT binds via specific
interactions of the B subunit pentamer with GM1 ganglioside, the membrane receptor, present on the
intestinal epithelial cell surface of the host. The A subunit is then translocated into the cell where it
ADP-ribosylates the Gs subunit of adenylate cyclase, bringing about the increased levels of cyclic
AMP in affected cells that is associated with the electrolyte and fluid loss of clinical cholera (Lebens
et al. 1994). For optimal enzymatic activity, the A1 fragment needs to be separated from the A2
fragment by proteolytic cleavage of the main chain and by reduction of the disulfide bond linking
them (Mekalanos et al. 1979).
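
The subunit masses quoted above imply an approximate mass for the B pentamer and for the assembled
holotoxin. The short sketch below only adds up the values given in the text (one 27 kDa A subunit and
five 11.6 kDa B subunits); the function names are hypothetical, and the result is a nominal sum rather
than a measured molecular weight.

```python
# Subunit masses as quoted in the text, in kDa.
A_SUBUNIT_KDA = 27.0
B_SUBUNIT_KDA = 11.6
B_SUBUNITS_PER_TOXIN = 5  # CT is an AB5 hexamer


def b_pentamer_mass_kda() -> float:
    """Nominal mass of the non-toxic B pentamer (CTB)."""
    return round(B_SUBUNITS_PER_TOXIN * B_SUBUNIT_KDA, 1)


def holotoxin_mass_kda() -> float:
    """Nominal mass of the AB5 holotoxin: one A subunit plus the B pentamer."""
    return round(A_SUBUNIT_KDA + b_pentamer_mass_kda(), 1)


if __name__ == "__main__":
    print("B pentamer (kDa):", b_pentamer_mass_kda())
    print("AB5 holotoxin (kDa):", holotoxin_mass_kda())
```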

Expression and assembly of CTB in transgenic potato tubers has been reported (Arakawa et
al. 1997). The CTB gene including the leader peptide was fused to an endoplasmic reticulum
retention signal (SEKDEL) at the 3' end to sequester the CTB protein within the lumen of the ER.
The DNA fragment encoding the 21-amino acid leader peptide of the CTB protein was retained to
direct the newly synthesized CTB protein into the lumen of the ER. Immunoblot analysis indicated
that the plant derived CTB protein was antigenically indistinguishable from the bacterial CTB
protein and that oligomeric CTB molecules (Mr ~50 kDa) were the dominant molecular species
isolated from transgenic potato leaf and tuber tissues. Similar to bacterial CTB, plant derived CTB
dissociated into monomers (Mr ~15 kDa) during heat/acid treatment.

Enzyme linked immunosorbent assay methods indicated that plant synthesized CTB protein
bound specifically to GM1 gangliosides, the natural membrane receptors of Cholera Toxin. The
maximum amount of CTB protein detected in auxin induced transgenic potato leaf and tuber tissues
was approximately 0.3% of the total soluble protein. The oral immunization of CD-1 mice with
transgenic potato tissues transformed with the CTB gene (administered at weekly intervals for a
month with a final booster feeding on day 65) has also been reported. The levels of serum and
mucosal anti-cholera toxin antibodies in mice were found to generate protective immunity against
the cytopathic effects of CT holotoxin. Following intraileal injection with CT, the plant immunized
mice showed up to a 60% reduction in diarrheal fluid accumulation in the small intestine. Systemic
and mucosal CTB-specific antibody titers were determined in both serum and feces collected from
immunized mice by the class-specific chemiluminescent ELISA method and the endpoint titers for
the three antibody isotypes (IgM, IgG and IgA) were determined. The extent of CT neutralization in
both Vero cell and ileal loop experiments suggested that anti-CTB antibodies prevent CT binding to
cellular GM1-gangliosides. Also, mice fed with 3 g of transgenic potato exhibited similar intestinal
protection as mice gavaged with 30 µg of bacterial CTB. Recombinant LTB [rLTB] (the heat labile
enterotoxin produced by Enterotoxigenic E.coli), which is structurally, functionally and
immunologically similar to CTB, was expressed in transgenic tobacco (Arntzen et al. 1998; Haq et al.
1995). They have reported that the rLTB retained its antigenicity as shown by immunoprecipitation
of rLTB with antibodies raised to rLTB from E.coli. The rLTB protein was of the right molecular
weight and aggregated to form the pentamer as confirmed by gel permeation chromatography.

Delivery of Human Insulin: Insulin has been delivered intravenously in the past several years.
However, more recently, alternate methods such as nasal spray are also available. Oral delivery of
insulin is yet another new approach (Mathiowitz et al., 1997). Engineered polymer microspheres
made of biologically erodable polymers, which display strong interactions with gastrointestinal
mucus and cellular linings, can traverse both mucosal absorptive epithelium and the
follicle-associated epithelium, covering the lymphoid tissue of Peyer's patches. Polymers maintain contact
with intestinal epithelium for extended periods of time and actually penetrate through and between
cells. Animals fed with the poly(FA:PLGA)-encapsulated insulin preparation were able to regulate
the glucose load better than controls, confirming that insulin crossed the intestinal barrier and was
released from the microspheres in a biologically active form (Mathiowitz et al., 1997).

In addition, CTB has been demonstrated to be an effective carrier molecule for the
induction of mucosal immunity to polypeptides to which it is chemically or genetically conjugated
(McKenzie et al. 1984; Dertzbaugh et al. 1993). The production of immunomodulatory
transmucosal carrier molecules, such as CTB, in plants may greatly improve the efficacy of edible
plant vaccines (Haq et al. 1995; Thanavala et al. 1995; Mason et al. 1996) and may also provide
novel oral tolerance agents for prevention of such autoimmune diseases as Type I diabetes (Zhang et
al. 1991), rheumatoid arthritis (Trentham et al. 1993), multiple sclerosis (Khoury et al. 1990; Miller
et al. 1992; Weiner et al. 1993) as well as the prevention of allergic and allograft rejection reactions
(Sayegh et al. 1992; Hancock et al. 1993). Therefore, expressing a CTB-proinsulin fusion would be
an ideal approach for oral delivery of insulin.

Chloroplast Genetic Engineering: When we developed the concept of chloroplast genetic
engineering (Daniell and McFadden, 1988 U.S. Patents; Daniell, World Patent, 1999), it was
possible to introduce isolated intact chloroplasts into protoplasts and regenerate transgenic plants
(Carlson, 1973). Therefore, early investigations on chloroplast transformation focused on the
development of in organello systems using intact chloroplasts capable of efficient and prolonged
transcription and translation (Daniell and Rebeiz, 1982; Daniell et al., 1983, 1986) and expression of
foreign genes in isolated chloroplasts (Daniell and McFadden, 1987). However, after the discovery
of the gene gun as a transformation device (Daniell, 1993), it was possible to transform plant
chloroplasts without the use of isolated plastids and protoplasts. Chloroplast genetic engineering
was accomplished in several phases. Transient expression of foreign genes in plastids of dicots
(Daniell et al., 1990; Ye et al., 1990) was followed by such studies in monocots (Daniell et al.,
1991). Unique to the chloroplast genetic engineering is the development of a foreign gene
expression system using autonomously replicating chloroplast expression vectors (Daniell et al.,
1990). Stable integration of a selectable marker gene into the tobacco chloroplast genome (Svab and
Maliga, 1993) was also accomplished using the gene gun. However, useful genes conferring
valuable traits via chloroplast genetic engineering have been demonstrated only recently. For
example, plants resistant to B.t. sensitive insects were obtained by integrating the cryIAc gene into
the tobacco chloroplast genome (McBride et al., 1995). Plants resistant to B.t. resistant insects (up
to 40,000 fold) were obtained by hyper-expression of the cryIIA gene within the tobacco chloroplast
genome (Kota et al., 1999). Plants have also been genetically engineered via the chloroplast genome
to confer herbicide resistance and the introduced foreign genes were maternally inherited,
overcoming the problem of out-cross with weeds (Daniell et al., 1998). Chloroplast genetic
engineering has also been used to produce pharmaceutical products that are not used by plants
(Staub et al. 2000, Guda et al. 2000). Chloroplast genetic engineering technology is currently being
applied to other useful crops (Sidorov et al. 1999; Daniell, 1999).

SUMMARY OF INVENTION

This invention synthesizes high value pharmaceutical proteins in nuclear transgenic plants by
chloroplast expression for pharmaceutical protein production. Chloroplasts are suitable for this
purpose because of their ability to process eukaryotic proteins, including folding and formation of
disulfide bridges, thereby eliminating the need for expensive post-purification processing. Tobacco
is an ideal choice for this purpose because of its large biomass, ease of scale-up (million seeds per
plant) and genetic manipulation. We use poly(GVGVP) as a fusion protein to enable
hyper-expression of insulin and accomplish rapid one step purification of fusion peptides utilizing the
inverse temperature transition properties of this polymer. We also use insulin-CTB fusion protein in
chloroplasts of nicotine free edible tobacco (LAMD 605) for oral delivery to NOD mice.

BRIEF DESCRIPTION OF DRAWINGS 
Fig. 1 shows graphs of Cry2A protein concentration determined by ELISA in transgenic
leaves.

Fig. 2 is an immunogold labeled electron micrograph of a mature transgenic leaf.

Fig. 3 contains photographs of leaves infected with 10 µl of 8x10^5, 8x10^4, 8x10^3 and 8x10^2
cells of P. syringae five days after inoculation.

Fig. 4 is a graph of total plant protein mixed with 5 µl of mid-log phase bacteria from
overnight culture, incubated for two hours at 25 °C at 125 rpm and grown in LB broth overnight.

Fig. 5A is a graph of CTB ELISA quantification shown as a percentage of total soluble plant
protein.

Fig. 5B is a graph of CTB-GM1 Ganglioside binding ELISA assays.

Fig. 6 is a 12% reducing PAGE using chemiluminescent detection with rabbit anti-cholera
serum (1°) and AP labeled mouse anti-rabbit IgG (2°) antibodies.

Figs. 7A and B show reducing gels of expression and assembly of disulfide bonded Guy's 13
monoclonal antibody.

Fig. 7C shows a non-reducing gel of expression and assembly of disulfide bonded Guy's 13
monoclonal antibody.

Figs. 8A - F show photographs comparing betaine aldehyde and spectinomycin selection.

Figs. 9A and B show biopolymer-proinsulin fusion protein expression.

Fig. 10A shows western blots of biopolymer-proinsulin fusion protein after single step
purification.

Fig. 10B shows western blots of another biopolymer-proinsulin fusion protein after single
step purification.

Fig. 10C shows western blots of yet another biopolymer-proinsulin fusion protein after single
step purification.

Fig. 11 shows biopolymer-proinsulin fusion gene integration into the chloroplast genome
confirmed by Southern blot analysis.

DETAILED DESCRIPTION 

A remarkable feature of chloroplast genetic engineering is the observation of exceptionally
large accumulation of foreign proteins in transgenic plants. This can be as much as 46% of CRY
protein in total soluble protein, even in bleached old leaves (De Cosa et al. 2001). Stable expression
of a pharmaceutical protein in chloroplasts was first reported for GVGVP, a protein based polymer
with varied medical applications (such as the prevention of post-surgical adhesions and scars, wound
coverings, artificial pericardia, tissue reconstruction and programmed drug delivery) (Guda et al.
2000). Subsequently, expression of the human somatotropin via the tobacco chloroplast genome
(Staub et al. 2000) to high levels (7% of total soluble protein) was observed. The following
investigations that are in progress illustrate the power of this technology to express small peptides,
entire operons, vaccines that require oligomeric proteins with stable disulfide bridges and
monoclonals that require assembly of heavy/light chains via chaperonins. It is essential to develop a
selection system free of antibiotic resistance genes for the edible insulin approach to be successful.
One such marker free chloroplast transformation system has been accomplished (Daniell et al. 2000).
Experiments are in progress to develop chloroplast transformation of edible leaves (alfalfa and
lettuce) for the practical applications of this approach.

Engineering novel pathways via the chloroplast genome: In plant and animal cells, nuclear
mRNAs are translated monocistronically. This poses a serious problem when engineering multiple
genes in plants (Bogorad, 2000). Therefore, single genes were first introduced into individual
transgenic plants, then these plants were back-crossed to reconstitute the entire pathway or the
complete protein to express the polyhydroxybutyrate polymer or Guy's 13 antibody (Nawrath et al.
1994; Ma et al. 1995). Similarly, in a seven year long effort, Ye et al. (2000) recently introduced a
set of three genes for a short biosynthetic pathway that resulted in β-carotene expression in rice. In
contrast, most chloroplast genes of higher plants are cotranscribed (Bogorad, 2000). Expression of
polycistrons via the chloroplast genome provides a unique opportunity to express entire pathways in
a single transformation event. We have recently used the Bacillus thuringiensis (Bt) cry2Aa2
operon as a model system to demonstrate operon expression and crystal formation via the chloroplast
genome (De Cosa et al. 2001). Cry2Aa2 is the distal gene of a three-gene operon. The ORF
immediately upstream of cry2Aa2 codes for a putative chaperonin that facilitates the folding of
Cry2Aa2 (and other proteins) to form proteolytically stable cuboidal crystals (Ge et al. 1998).

Therefore, the cry2Aa2 bacterial operon was expressed in tobacco chloroplasts to test the
resultant transgenic plants for increased expression and improved persistence of the accumulated
insecticidal protein(s). Stable foreign gene integration was confirmed by PCR and Southern blot
analysis in T0 and T1 transgenic plants. Cry2Aa2 operon derived protein accumulated at 45.3% of
the total soluble protein in mature leaves and remained stable even in old bleached leaves (46.1%) as
shown in Fig. 1. This is the highest level of foreign gene expression reported in transgenic plants.
Insects that are exceedingly difficult to control (10-day old cotton bollworm, beet armyworm) were
killed 100% after consuming transgenic leaves. Electron micrographs showed the presence of the insecticidal
protein folded into cuboidal crystals similar in shape to Cry2Aa2 crystals observed in Bacillus
thuringiensis as shown in Fig. 2. In contrast to currently marketed transgenic plants with soluble
CRY proteins, folded protoxin crystals are processed only by target insects that have alkaline gut
pH. This approach improves safety of Bt transgenic plants. Absence of insecticidal proteins in
transgenic pollen eliminates toxicity to non-target insects via pollen. In addition to these
environmentally friendly approaches, this observation serves as a model system for large-scale
production of foreign proteins within chloroplasts in a folded configuration, enhancing their stability
and facilitating single step purification. This is the first demonstration of expression of a bacterial
operon in transgenic plants and opens the door to engineer novel pathways in plants in a single
transformation event.

Expressing small peptides via the chloroplast genome: It is common knowledge that the medical
community has been fighting a vigorous battle against drug resistant pathogenic bacteria for years.
Cationic antibacterial peptides from mammals, amphibians and insects have gained more attention
over the last decade (Hancock and Lehrer, 1998). Key features of these cationic peptides are a net
positive charge, an affinity for negatively-charged prokaryotic membrane phospholipids over
neutral-charged eukaryotic membranes and the ability to form aggregates that disrupt the bacterial
membrane (Biggin and Sansom, 1999).

There are three major peptides with α-helical structures: cecropin from Hyalophora cecropia
(giant silk moth), magainins from Xenopus laevis (African frog) and defensins from mammalian
neutrophils. Magainin and its analogues have been studied as a broad-spectrum topical agent, a
systemic antibiotic, a wound-healing stimulant, and an anticancer agent (Jacob and Zasloff, 1994).
We recently observed that a synthetic lytic peptide (MSI-99, 22 amino acids) can be successfully
expressed in tobacco chloroplasts (DeGray et al. 2000). The peptide retained its lytic activity against
the phytopathogenic bacterium Pseudomonas syringae and the multidrug resistant human pathogen
Pseudomonas aeruginosa. The anti-microbial peptide (AMP) used in this study was an amphipathic
alpha-helix molecule that has an affinity for negatively charged phospholipids commonly found in
the outer membrane of bacteria. Upon contact with these membranes, individual peptides aggregate
to form pores in the membrane, resulting in bacterial lysis. Because of the concentration dependent
action of the AMP, it was expressed via the chloroplast genome to accomplish high dose delivery at
the point of infection. PCR products and Southern blots confirmed chloroplast integration of the
foreign genes and homoplasmy. Growth and development of the transgenic plants was unaffected
by hyper-expression of the AMP within chloroplasts. In vitro assays with T0 and T1 plants
confirmed that the AMP was expressed at high levels (21.5 to 43% of the total soluble protein) and
retained biological activity against Pseudomonas syringae, a major plant pathogen. In situ assays
resulted in intense areas of necrosis around the point of infection in control leaves, while
transformed leaves showed no signs of necrosis (200-800 µg of AMP at the site of infection) as
shown in Fig. 3. T1 in vitro assays against Pseudomonas aeruginosa (a multi-drug resistant human
pathogen) displayed a 96% inhibition of growth as shown in Fig. 4. These results give a new option
in the battle against phytopathogenic and drug-resistant human pathogenic bacteria. Small peptides
(like insulin) are degraded in most organisms. However, stability of this AMP in chloroplasts opens
up this compartment for expression of hormones and other small peptides.

Expression and assembly of monoclonals in transgenic chloroplasts: Dental caries (cavities) is
probably the most prevalent disease of humankind. Colonization of teeth by S. mutans is the single
most important risk factor in the development of dental caries. S. mutans is a non-motile, Gram-
positive coccus. It colonizes tooth surfaces and synthesizes glucans (insoluble polysaccharide) and
fructans from sucrose using the enzymes glucosyltransferase and fructosyltransferase respectively
(Hotz et al. 1972). The glucans play an important role by allowing the bacterium to adhere to the
smooth tooth surfaces. The bacterium ferments sucrose and produces lactic acid after its adherence.
Lactic acid dissolves the minerals of the tooth, thereby producing a cavity.

A topical monoclonal antibody therapy to prevent adherence of S. mutans to teeth has
recently been developed. The incidence of cariogenic bacteria (in humans and animals) and dental
caries (in animals) was dramatically reduced for periods of up to two years after the cessation of the
antibody therapy. No adverse events were detected either in the exposed animals or in human
volunteers (Ma et al. 1998). The annual requirement for this antibody in the US alone may
eventually exceed 1 metric ton. Therefore, this antibody was expressed via the chloroplast genome
to achieve higher levels of expression and proper folding (Panchal et al. 2000). The integration of
antibody genes into the chloroplast genome was confirmed by PCR and Southern blot analysis. The
expression of both heavy and light chains was confirmed by western blot analysis under reducing
conditions as shown in Figs. 7A and B. The expression of fully assembled antibody was confirmed
by western blot analysis under non-reducing conditions as shown in Fig. 7C. This is the first report
of successful assembly of a multi-subunit human protein in transgenic chloroplasts. Production of
monoclonal antibodies at agricultural level should reduce their cost and create new applications of
monoclonal antibodies.

Marker free chloroplast transgenic plants: Most transformation techniques co-introduce a gene
that confers antibiotic resistance, along with the gene of interest to impart a desired trait.
Regenerating transformed cells in antibiotic containing growth media permits selection of only those
cells that have incorporated the foreign genes. Once transgenic plants are regenerated, antibiotic
resistance genes serve no useful purpose but they continue to produce their gene products. One
of the primary concerns about genetically modified (GM) crops is the presence of clinically
important antibiotic resistance gene products in transgenic plants that could inactivate oral doses of
the antibiotic (reviewed by Puchta 2000; Daniell 1999A). Alternatively, the antibiotic resistance
genes could be transferred to pathogenic microbes in the gastrointestinal tract or soil, rendering them
resistant to treatment with such antibiotics. Antibiotic resistant bacteria are one of the major
challenges of modern medicine. In Germany, GM crops containing antibiotic resistance genes have
been banned from release (Peerenboom 2000).

Chloroplast genetic engineering offers several advantages over nuclear transformation
including high levels of gene expression and gene containment but utilizes thousands of copies of
the most commonly used antibiotic resistance genes. Engineering genetically modified (GM) crops
without the use of antibiotic resistance genes should eliminate potential risk of their transfer to the
environment or gut microbes. Therefore, the betaine aldehyde dehydrogenase (BADH) gene from
spinach is used herein as a selectable marker (Daniell et al. 2000). The selection process involves
conversion of toxic betaine aldehyde (BA) by the chloroplast BADH enzyme to nontoxic glycine
betaine, which also serves as an osmoprotectant. Chloroplast transformation efficiency was 25-fold
higher with BA selection than with spectinomycin selection, and regeneration was more rapid
(Table 1). Transgenic shoots appeared within 12 days in 80% of leaf discs (up to 23 shoots per disc)
on BA selection compared to 45 days in 15% of discs (1 or 2 shoots per disc) on spectinomycin
selection as shown in Fig. 8. Southern blots confirm stable integration of foreign genes into all of
the chloroplast genomes (~10,000 copies per cell) resulting in homoplasmy. Transgenic tobacco
plants showed 1527-1816% higher BADH activity at different developmental stages than
untransformed controls. Transgenic plants were morphologically indistinguishable from
untransformed plants and the introduced trait was stably inherited in the subsequent generation.
This is the first report of genetic engineering of the chloroplast genome without the use of antibiotic
selection. Use of genes that are naturally present in spinach for selection, in addition to gene
containment, should ease public concerns or perception of GM crops. Also, this should be very
helpful in the development of edible insulin.

Expression of cholera toxin B subunit oligomers as a vaccine in chloroplasts: CTB, when
administered orally (Lebens and Holmgren, 1994), is a potent mucosal immunogen, which can
neutralize the toxicity of the CT holotoxin by preventing it from binding to the intestinal cells (Mor
et al. 1998). This is believed to be a result of binding to eukaryotic cell surfaces via the GM1
gangliosides, receptors present on the intestinal epithelial surface, thus eliciting a mucosal immune
response to pathogens (Lipscombe et al. 1991) and enhancing the immune response when chemically
coupled to other antigens (Dertzbaugh and Elson, 1993; Holmgren et al. 1993; Nashar et al. 1993;
Sun et al. 1994).

Cholera toxin B subunit (CTB) has previously been expressed in nuclear transgenic plants at levels of
0.01% (leaves) to 0.3% (tubers) of the total soluble protein. To increase expression levels, we
engineered the chloroplast genome to express the unmodified CTB gene (Henriques and Daniell,
2000). We observed expression of oligomeric CTB at levels of 4-5% of total soluble plant protein
as shown in Fig. 5A. PCR and Southern blot analyses confirmed stable integration of the CTB gene
into the chloroplast genome. Western blot analysis showed that transgenic chloroplast expressed
CTB was antigenically identical to commercially available purified CTB antigen as shown in Fig. 6.
Also, GM1-ganglioside binding assays confirm that chloroplast synthesized CTB binds to the
intestinal membrane receptor of cholera toxin as shown in Fig. 5B. Transgenic tobacco plants were
morphologically indistinguishable from untransformed plants and the introduced gene was found to
be stably inherited in the subsequent generation as confirmed by PCR and Southern blot analyses.
The increased production of an efficient transmucosal carrier molecule and delivery system, like
CTB, in chloroplasts of plants makes plant based oral vaccines, and fusion proteins with CTB
needing oral administration, a much more feasible approach. These observations establish
unequivocally that chloroplasts are capable of forming disulfide bridges to assemble foreign
proteins, and are ideal for expression of CTB fusion proteins.


Polymer-proinsulin Recombinant DNA Vectors: One possible insulin expression system involves
independent expression of insulin chains A and B, as it has been produced in E. coli for commercial
purposes in the past. The disadvantage of this method is that E. coli does not form disulfide bridges
in the cell unless the protein is targeted to the periplasm. Expensive in vitro assembly after
purification is necessary for this approach. Therefore, a better approach is to express the human
proinsulin as a polymer fusion protein. This method is ideal because chloroplasts are capable of
forming disulfide bridges. Using a single gene, as opposed to the individual chains, eliminates the
need to conduct two parallel vector construction processes, as is the case for the individual
chains. In addition, the need for individual fermentations and purification procedures is eliminated
by the single gene method. Finally, proinsulin requires less processing following extraction.

Recently, the human pre-proinsulin gene was obtained from Genentech, Inc. First, the pre-proinsulin
gene was sub-cloned into pUC19 to facilitate further manipulations. The next step was to
design primers to make chloroplast expression vectors. Since we are interested in proinsulin
expression, the 5' primer was designed to land on the proinsulin sequence. This FW primer excluded
the 69 bases (23 encoded amino acids) of the leader or pre-sequence of pre-proinsulin. Also, the
forward primer included the enzymatic cleavage site for the protease factor Xa to avoid the use of
cyanogen bromide. In addition to the factor Xa site, a SmaI site was introduced to facilitate
subsequent subcloning. The order of the FW primer sequence is SmaI - Xa-factor - Proinsulin gene.
The reverse primer included BamHI and XbaI sites, plus a short sequence with homology to the
pUC19 sequence following the proinsulin gene. The 297 bp PCR product (Xa Pris) was cloned into
pCR2.1. A GVGVP 50-mer was generated as described previously (Daniell et al. 1997) along with
the RBS sequence GAAGGAG. Another SmaI partial digestion was performed to eliminate the stop
codon of the biopolymer gene, decrease the 50mer to a 40mer, and fuse the 40mer to the Xa-
proinsulin sequence. Once the correct fragment was obtained by the partial digestion with SmaI
(eliminating the stop codon but including the RBS site), it was ligated to the Xa-proinsulin fusion
gene resulting in the construct pCR2.1-40-XaPris. Finally, the biopolymer (40mer)-proinsulin
fusion gene was subcloned into the chloroplast vector pLD-CtV or pSBL-CtV and the orientation
was checked in the final vector using suitable restriction sites.
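
The forward primer layout described above (SmaI - Xa-factor - Proinsulin gene) can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the factor Xa codons shown are one possible encoding of the Ile-Glu-Gly-Arg recognition sequence, the proinsulin annealing region is a placeholder of N's because the actual primer bases are not given in the text, and all function names are hypothetical.

```python
# Illustrative sketch of the FW primer layout: SmaI - factor Xa - proinsulin.
# Names and codon choices below are assumptions, not the primers actually used.

SMAI_SITE = "CCCGGG"  # SmaI recognition sequence

# Factor Xa recognizes Ile-Glu-Gly-Arg; these four codons are one possible
# encoding chosen only for illustration.
XA_CODONS = ["ATC", "GAA", "GGT", "CGT"]

# Placeholder for the 3' annealing region on the proinsulin sequence.
# The real bases are not stated in the text, so N's are used here.
PROINSULIN_5PRIME_PLACEHOLDER = "N" * 18

# Minimal codon table covering only the codons used in this sketch.
CODON_TABLE = {"ATC": "I", "GAA": "E", "GGT": "G", "CGT": "R"}

LEADER_BASES = 69  # pre-sequence bases excluded by the FW primer (from the text)


def leader_codons(n_bases: int) -> int:
    """Return the number of codons in a leader of n_bases nucleotides."""
    if n_bases % 3 != 0:
        raise ValueError("leader length is not a whole number of codons")
    return n_bases // 3


def translate(codons) -> str:
    """Translate a list of codons using the minimal table above."""
    return "".join(CODON_TABLE[c] for c in codons)


def build_fw_primer() -> str:
    """Assemble the primer 5'->3' in the order SmaI - Xa site - proinsulin."""
    return SMAI_SITE + "".join(XA_CODONS) + PROINSULIN_5PRIME_PLACEHOLDER


if __name__ == "__main__":
    print(leader_codons(LEADER_BASES))  # codons removed with the pre-sequence
    print(translate(XA_CODONS))         # amino acids encoded by the Xa site
    print(build_fw_primer())
```

The 18 nucleotides placed ahead of the annealing region (6 for SmaI plus 12 for the Xa site) form a whole number of codons, so in this sketch the added sequence does not shift the reading frame of the proinsulin sequence that follows.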

Expression and Purification of the Biopolymer-proinsulin fusion protein: XL-1 Blue strain of
E. coli containing pLD-OC-XaPris and the negative controls, which included a plasmid containing
the gene in the reverse orientation and the E. coli strain without any plasmid, were grown in TB broth.
Cell pellets were resuspended in 500 µl of autoclaved dH2O or 6M Guanidine hydrochloride
phosphate buffer, pH 7.0, sonicated, and centrifuged at 4°C at 10,000 g for 10 min. After
centrifugation, the supernatants were mixed with an equal volume of 2X TN buffer (100 mM
Tris-HCl, pH 8, 100 mM NaCl). Tubes were warmed at 42°C for 25 min to induce biopolymer
aggregation. Then the fusion protein was recovered by centrifuging at 2,500 rpm at 42°C for 3 min.
Samples were run in a 16.5% Tricine gel, transferred to a nitrocellulose membrane, and
immunoblotting was performed. When the sonic extract is in 6M Guanidine Hydrochloride
Phosphate Buffer, pH 7.0, the apparent molecular weight shifts from its original and correct MW of
24 kDa to a higher MW of approximately 30 kDa as shown in Figs. 9A and B. This is probably due
to the conformation of the biopolymer in this buffer.

The gel was first stained with 0.3M CuCl2 and then the same gel was stained with
Coomassie R-250 Staining Solution for an hour and then destained first for 15 min, and then
overnight. CuCl2 creates a negative stain (Lee et al. 1987). Polymer proteins (without fusion)
appear as clear bands against a blue background in color or dark against a light semiopaque
background as shown in Fig. 9A. This stain was used because other protein stains such as
Coomassie Blue R250 do not stain the polymer protein due to the lack of aromatic side chains
(McPherson et al., 1992). Therefore, the observation of the 24 kDa protein in the R250 stained gel as
shown in Fig. 9B is due to the insulin fusion with the polymer. This observation was further
confirmed by probing these blots with the anti-human proinsulin antibody. As anticipated, the
polymer insulin fusion protein was observed in western blots as shown in Figs. 10A and B. Larger
proteins observed in Figs. 10A - C are tetramer and hexamer complexes of proinsulin. It is evident
that the insulin-polymer fusion proteins are stable in E. coli. Confirming this observation, recently
others have shown that the PBP polymer protein conjugates (with thioredoxin and tendamistat)
undergo thermally reversible phase transition, retaining the transition behavior of the free polymer
(Meyer and Chilkoti, 1999). These results clearly demonstrate that insulin fusion has not affected
the inverse temperature transition property of the polymer. One of the concerns is the stability of
insulin at temperatures used for thermally reversible purification. Temperature induced production
of human insulin has been in commercial use (Schmidt et al. 1999). Also, the temperature transition
can be lowered by increasing the ionic strength of the solution during purification of this PBP
(McPherson et al. 1996). Thus, GVGVP-fusion could be used to purify a multitude of economically
important proteins in a simple inexpensive step.


Biopolymer-proinsulin fusion gene expression in chloroplast: As described in section d, the
chloroplast vector was delivered into tobacco chloroplasts by particle bombardment
(Daniell, 1997). PCR and Southern blots were performed to confirm biopolymer-proinsulin fusion
gene integration into the chloroplast genome. Southern blots show homoplasmy in most T0 lines but a
few showed some heteroplasmy as shown in Fig. 11. Western blots show the expression of polymer
proinsulin fusion protein in all transgenic lines in Fig. 10C. Quantification is by ELISA.

Protease Xa Digestion of the Biopolymer-proinsulin fusion protein and Purification of
Proinsulin: The enzymatic cleavage of the fusion protein to release the proinsulin protein from the
(GVGVP)40 was initiated by adding the factor Xa protease to the purified fusion protein at a ratio
(w/w) of approximately 1:500. Cleavage of the fusion protein was monitored by SDS-PAGE
analysis. We detected cleaved proinsulin in the extracts isolated in 6M guanidine hydrochloride
buffer as shown in Figs. 10A and B. Conditions are now being optimized for complete cleavage. The
Xa protease has been successfully used previously to cleave (GVGVPho-GST fusion (McPherson et
al. 1992).
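
The approximate 1:500 (w/w) ratio stated above amounts to a simple mass calculation. The sketch below assumes the ratio is read as protease mass to fusion protein mass; the function name and the microgram unit are illustrative choices, not part of the original protocol.

```python
def factor_xa_mass_ug(fusion_protein_ug: float, ratio: float = 500.0) -> float:
    """Mass of protease (in micrograms) for a 1:ratio (w/w) digest.

    Assumes the ratio is protease : fusion protein, as read from the text.
    """
    if fusion_protein_ug < 0 or ratio <= 0:
        raise ValueError("mass must be non-negative and ratio must be positive")
    return fusion_protein_ug / ratio


if __name__ == "__main__":
    # Example: protease needed for 1 mg (1000 micrograms) of purified fusion protein.
    print(factor_xa_mass_ug(1000.0))
```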

Evaluation of chloroplast gene expression: A systematic approach to identify and overcome
potential limitations of foreign gene expression in chloroplasts of transgenic plants is essential.
Information gained herein increases the utility of the chloroplast transformation system for scientists
interested in expressing other foreign proteins. Therefore, it is important to systematically analyze
transcription, RNA abundance, RNA stability, rate of protein synthesis and degradation, proper
folding and biological activity. For example, the rate of transcription of the introduced insulin gene
may be compared with the highly expressing endogenous chloroplast genes (rbcL, psbA, 16S
rRNA), using run on transcription assays to determine if the 16S rRNA promoter is operating as
expected. Transgenic chloroplasts containing each of the three constructs with different 5' regions are
investigated to test their transcription efficiency. Similarly, transgene RNA levels are monitored by
northerns, dot blots and primer extension relative to endogenous rbcL, 16S rRNA, or psbA. These
results along with run on transcription assays should provide valuable information on RNA stability,
processing, etc. In our past experience with expression of several foreign genes, foreign transcripts
appear to be extremely stable based on northern blot analysis. However, a systematic study is
valuable to advance utility of this system by other scientists.

Importantly, the efficiency of translation may be tested in isolated chloroplasts and compared
with the highly translated chloroplast protein (psbA). Pulse chase experiments help assess whether
translational pausing or premature termination occurs. Evaluation of percent RNA loaded on
polysomes or in constructs with or without 5' UTRs helps determine the efficiency of the ribosome
binding site and 5' stem-loop translational enhancers. Codon optimized genes are also compared
with unmodified genes to investigate the rate of translation, pausing and termination. In our recent
experience, we observed a 200-fold difference in accumulation of foreign proteins due to decreases
in proteolysis conferred by a putative chaperonin (De Cosa et al. 2001). Therefore, proteins from
constructs expressing or not expressing the putative chaperonin (with or without ORF1+2) provide
valuable information on protein stability. Thus, all of this information may be used to improve the
next generation of chloroplast vectors.

Optimization of gene expression: We have reported that foreign genes are expressed between 3%
(cry2Aa2) and 46% (cry2Aa2 operon) in transgenic chloroplasts (Kota et al. 1999; De Cosa et al.
2001). Several approaches may be used to enhance translation of the recombinant proteins. In
chloroplasts, transcriptional regulation as a bottleneck in gene expression has been overcome by
utilizing the strong constitutive promoter of the 16S rRNA (Prrn). One advantage of Prrn is that it is
recognized by both the chloroplast encoded RNA polymerase and the nuclear encoded chloroplast
RNA polymerase in tobacco (Allison et al. 1996). Several investigators have utilized Prrn in their
studies to overcome the initial hurdle of gene expression, transcription (De Cosa et al. 2001, Eibl et
al. 1999, Staub et al. 2000). RNA stability appears to be one of the lesser problems because of
the observation of excessive accumulation of foreign transcripts, at times 16,966-fold higher than the
highly expressing nuclear transgenic plants (Lee et al. 2000). Also, other investigations regarding



WO 01/72959 PCT/US01/06288- 


1465-PCT-00 (1 577-P-OO) 

RNA stability in chloroplasts suggest that efforts for optimizing gene expression need to be
addressed at the post-transcriptional level (Higgs et al. 1999, Eibl et al. 1999). Our work focuses on
addressing protein expression post-transcriptionally. For example, 5' and 3' UTRs are needed for
optimal translation and mRNA stability of chloroplast mRNAs (Zerges 2000). Optimal ribosomal
binding sites (RBSs) as well as a stem-loop structure located 5' adjacent to the RBS are needed for
efficient translation. A recent study has shown that replacement of the Shine-Dalgarno (GGAGG)
with the psbA 5' UTR downstream of the 16S rRNA promoter enhanced translation of a foreign
gene (GUS) a hundred-fold (Eibl et al. 1999). Therefore, the 200-bp tobacco chloroplast DNA
fragment (1680-1480) containing the 5' psbA UTR may be used. This PCR product is inserted
downstream of the 16S rRNA promoter to enhance translation of the recombinant proteins.

Yet another approach for enhancement of translation is to optimize codon compositions. We
have compared the A+T% content of all foreign genes that had been expressed in transgenic chloroplasts
with the percentage of chloroplast expression. We found that higher levels of A+T always correlated
with high expression levels (see Table 2). It is also potentially possible to modify chloroplast
protease recognition sites while modifying codons, without affecting their biological functions.
Therefore, optimizing codon compositions of insulin and polymer genes to match the psbA gene
should enhance the level of translation. Although rbcL (RuBisCO) is the most abundant protein on
earth, it is not translated as highly as the psbA gene due to the extremely high turnover of the psbA
gene product. The psbA gene is under stronger selection for increased translation efficiency and is
the most abundant thylakoid protein. In addition, the codon usage in higher plant chloroplasts is
biased towards the NNC codon of 2-fold degenerate groups (i.e. TTC over TTT, GAC over GAT,
CAC over CAT, AAC over AAT, ATC over ATT, ATA etc.). This is in addition to a strong bias
towards T at the third position of 4-fold degenerate groups. There is also a context effect that should
be taken into consideration while modifying specific codons. The 2-fold degenerate sites
immediately upstream from a GNN codon do not show this bias towards NNC (TTT GGA is
preferred to TTC GGA while TTC CGT is preferred to TTT CGT, TTC AGT to TTT AGT and TTC
TCT to TTT TCT; Morton, 1993; Morton and Bernadette, 2000). In addition, highly expressed
chloroplast genes use GNN more frequently than other genes. The web sites
http://www.kazusa.or.jp/codon and http://www.ncbi.nlm.nih.gov may be used to optimize codon
composition by comparing codon usage of different plant species' genomes and psbA genes.
Abundance of amino acids in chloroplasts and tRNA anticodons present in chloroplasts may be taken
into consideration. Optimization of polymer and proinsulin may be performed using a novel PCR
approach (Prodromou and Pearl, 1992; Casimiro et al. 1997), which has been successfully used in
our laboratory to optimize codon composition of other human proteins.
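
The two codon-composition ideas in the preceding paragraph, A+T content and the NNC preference with its GNN context effect, can be sketched as follows. This is a minimal illustration, not the optimization procedure actually used: the function names are hypothetical, only the four 2-fold degenerate pairs named in the text are covered, and the context rule is simplified to "use NNT when the next codon is GNN, otherwise NNC", following the TTT GGA example.

```python
from typing import Optional

# The 2-fold degenerate pairs named in the text, stored as (NNC, NNT).
TWO_FOLD_NNC_NNT = {
    "F": ("TTC", "TTT"),
    "D": ("GAC", "GAT"),
    "H": ("CAC", "CAT"),
    "N": ("AAC", "AAT"),
}


def at_percent(seq: str) -> float:
    """Return the A+T content of a nucleotide sequence as a percentage."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    at = seq.count("A") + seq.count("T")
    return 100.0 * at / len(seq)


def pick_two_fold_codon(amino_acid: str, next_codon: Optional[str]) -> str:
    """Choose a codon for a 2-fold degenerate amino acid.

    Simplified rule (an assumption of this sketch): prefer NNC, except
    immediately upstream of a GNN codon, where NNT is chosen instead.
    """
    nnc, nnt = TWO_FOLD_NNC_NNT[amino_acid]
    if next_codon is not None and next_codon.upper().startswith("G"):
        return nnt
    return nnc


if __name__ == "__main__":
    print(pick_two_fold_codon("F", "GGA"))  # Phe upstream of a GNN codon
    print(pick_two_fold_codon("F", "CGT"))  # Phe upstream of a non-GNN codon
    print(at_percent("ATGC"))
```

A fuller treatment would weight every codon by the chloroplast usage tables referred to above rather than applying a fixed rule.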

Vector constructions: The pLD vector is used for all the constructs. This vector was developed for
chloroplast transformation. It contains the 16S rRNA promoter (Prrn) driving the selectable marker
gene aadA (aminoglycoside adenyl transferase conferring resistance to spectinomycin) followed by
the multiple cloning site and then the psbA 3' region (the terminator from a gene coding for
photosystem II reaction center components) from the tobacco chloroplast genome. The pLD vector
is a universal chloroplast expression/integration vector and can be used to transform chloroplast
genomes of several other plant species (Daniell et al. 1998, Daniell 1999) because its flanking
sequences are highly conserved among higher plants. The universal vector uses trnA and trnI genes
(chloroplast transfer RNAs coding for Alanine and Isoleucine) from the inverted repeat region of the
tobacco chloroplast genome as flanking sequences for homologous recombination. Because the
universal vector integrates foreign genes within the Inverted Repeat region of the chloroplast
genome, it should double the copy number of the transgene (from 5000 to 10,000 copies per cell in
tobacco). Furthermore, it has been demonstrated that homoplasmy is achieved even in the first
round of selection in tobacco, probably because of the presence of a chloroplast origin of replication
within the flanking sequence in the universal vector (thereby providing more templates for
integration). For these and several other reasons, foreign gene expression was shown to be much
higher when the universal vector was used instead of the tobacco specific vector (Guda et al. 2000).

CTB-Proinsulin Vector Construction: The chloroplast expression vector pLD-CTB-Proins may
be constructed as follows. First, both proinsulin and cholera toxin B-subunit genes were amplified
from suitable DNA using primer sequences. Primer 1 contains the GGAGG chloroplast preferred
ribosome binding site five nucleotides upstream of the start codon (ATG) for the CTB gene and a
suitable restriction enzyme site (SpeI) for insertion into the chloroplast vector. Primer 2 eliminates
the stop codon and adds the first two amino acids of a flexible hinge tetrapeptide GPGP as reported
by Bergerot et al. (1997), to facilitate folding of the CTB-proinsulin fusion protein. Primer 3 adds
the remaining two amino acids for the hinge tetrapeptide and eliminates the pre-sequence of the
native pre-proinsulin. Primer 4 adds a suitable restriction site (SpeI) for subcloning into the
chloroplast vector. Amplified PCR products may be inserted into the TA cloning vector. Both the
CTB and proinsulin PCR fragments may be excised at the SmaI and XbaI restriction sites. Eluted
fragments are ligated into the TA cloning vector. The CTB-proinsulin fragment may be excised at
the EcoRI sites and inserted into EcoRI digested dephosphorylated pLD vector.
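
The layout of Primer 1 and the division of the GPGP hinge between Primers 2 and 3 can be sketched as below. This is a hedged illustration only: the five spacer nucleotides are written as N's because their identity is not given in the text, "five nucleotides upstream" is interpreted here as a five-nucleotide gap between the end of GGAGG and the ATG, and the function names are hypothetical.

```python
SPEI_SITE = "ACTAGT"   # SpeI recognition sequence
RBS = "GGAGG"          # chloroplast preferred ribosome binding site (from the text)
START_CODON = "ATG"
SPACER_PLACEHOLDER = "NNNNN"  # five unspecified nucleotides (assumption)
HINGE = "GPGP"         # flexible hinge tetrapeptide (from the text)


def build_primer1_prefix(spacer: str = SPACER_PLACEHOLDER) -> str:
    """Assemble SpeI site + RBS + 5-nt spacer + start codon, 5'->3'."""
    if len(spacer) != 5:
        raise ValueError("spacer must be exactly five nucleotides")
    return SPEI_SITE + RBS + spacer + START_CODON


def rbs_to_start_gap(primer: str) -> int:
    """Number of nucleotides between the end of the RBS and the start codon."""
    rbs_end = primer.index(RBS) + len(RBS)
    start = primer.index(START_CODON, rbs_end)
    return start - rbs_end


def split_hinge(hinge: str = HINGE):
    """Split the hinge into the halves added by Primer 2 and Primer 3."""
    return hinge[:2], hinge[2:]


if __name__ == "__main__":
    primer1 = build_primer1_prefix()
    print(primer1)
    print(rbs_to_start_gap(primer1))
    print(split_hinge())
```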

The following vectors may be designed to optimize protein expression, purification and
production of proteins with the same amino acid composition as in human insulin.


a) Using tobacco plants, Eibl (1999) demonstrated, in vivo, the differences in translation
efficiency and mRNA stability of a GUS reporter gene due to various 5' and 3' untranslated
regions (UTRs). This already described systematic transcription and translation analysis
can be used in a practical endeavor of insulin production. Consistent with Eibl's (1999) data
for increased translation efficiency and mRNA stability, the psbA 5' UTR can be used in
addition to the psbA 3' UTR already in use. The 200 bp tobacco chloroplast DNA
fragment containing the 5' psbA UTR may be amplified by PCR using tobacco chloroplast
DNA as template. This fragment may be cloned directly in the pLD vector multiple cloning
site downstream of the promoter and the aadA gene. The cloned sequence may be exactly
the same as in the psbA gene.

b) Another approach of protein production in chloroplasts involves potential insulin
crystallization for facilitating purification. The cry2Aa2 Bacillus thuringiensis operon
derived putative chaperonin may be used. Expression of the cry2Aa2 operon in chloroplasts
provides a model system for hyper-expression of foreign proteins (46% of total soluble
protein) in a folded configuration enhancing their stability and facilitating purification (De
Cosa et al. 2001). This justifies inclusion of the putative chaperonin from the cry2Aa2
operon in one of the newly designed constructs. In this region there are two open reading
frames (ORF1 and ORF2) and a ribosomal binding site (rbs). This sequence contains
elements necessary for Cry2Aa2 crystallization, which help to crystallize insulin and aid in
subsequent purification. Successful crystallization of other proteins using this putative
chaperonin has been demonstrated (Ge et al. 1998). The ORF1 and ORF2 of the Bt cry2Aa2
operon may be amplified by PCR using the complete operon as a template. Subsequent
cloning, using a novel PCR technique, allows for direct fusion of this sequence immediately
upstream of the proinsulin fusion protein without altering the nucleotide sequence, which is
normally necessary to provide a restriction enzyme site (Horton et al. 1988).

c) To address codon optimization the proinsulin gene may be subjected to certain modifications
in subsequent constructs. The plastid modified proinsulin (PtPris) can have its nucleotide
sequence modified such that the codons are optimized for plastid expression, yet its amino
acid sequence remains identical to human proinsulin. PtPris is an ideal substitute for human
proinsulin in the CTB fusion peptide. The expression of this construct can be compared to
the native human proinsulin to determine the effects of codon optimization, which serves to
address one relevant mechanistic parameter of translation. Analysis of the human proinsulin
gene showed that 48 of its 87 codons were the lowest frequency codons in the chloroplast for
the amino acid which they encode. For example, there are six different codons for
leucine. Their frequency within the chloroplast genome ranges from 7.3 to 30.8 per
thousand codons. There are 12 leucines in proinsulin; 8 use the lowest frequency codon
(7.3), and none use the highest frequency codon (30.8). In the plastid-optimized
proinsulin gene all the codons are the most frequent ones, whereas in human proinsulin over
half of the codons are the least frequent. The human proinsulin nucleotide sequence contains
62% C+G, whereas the plastid optimized proinsulin gene contains 24% C+G. Generally, lower
C+G content of foreign genes correlates with higher levels of expression (Table 2).

d) Another version of the proinsulin gene, mini-proinsulin (MPris), may also have its codons
optimized for plastid expression, and its amino acid sequence does not otherwise differ from human
proinsulin (Pris). The Pris sequence is B Chain-RR-C Chain-KR-A Chain, whereas the MPris
sequence is B Chain-KR-A Chain. The MPris sequence excludes the RR-C Chain, which is
normally excised in proinsulin maturation to insulin. The C chain of proinsulin is
unnecessary for in vitro production of insulin. Proinsulin folds properly and forms the
appropriate disulfide bonds in the absence of the C chain. The remaining KR motif that
exists between the B chain and the A chain in MPris allows for mature insulin production
upon cleavage with trypsin and carboxypeptidase B. This construct may be used for our
biopolymer fusion protein. Its codon optimization and amino acid sequence are ideal for
mature insulin production.

e) Our current human proinsulin-biopolymer fusion protein contains a factor Xa proteolytic cut 
site, which serves as a cleavage point between the biopolymer and the proinsulin. Currently, 
cleavage of the polymer-proinsulin fusion protein with the factor Xa has been inefficient in 
our hands. Therefore, we replace this cut site with a trypsin cut site. This eliminates the
need for the expensive factor Xa in processing proinsulin. Since proinsulin is currently
processed by trypsin in the formation of mature insulin, insulin maturation and fusion
peptide cleavage can be achieved in a single step with trypsin and carboxypeptidase B.

f) We observed incomplete translation products in plastids when we expressed the 120mer gene
(Guda et al. 2000). Therefore, while expressing the polymer-proinsulin fusion protein, we
decreased the length of the polymer protein to 40mer, without losing the thermal responsive
property. In addition, optimal codons for glycine (GGT) and valine (GTA), which constitute
80% of the total amino acids of the polymer, have been used. In all nuclear encoded genes,
glycine makes up 147/1000 amino acids, while in tobacco chloroplasts it is 129/1000. In highly
expressed genes such as psbA and rbcL of tobacco, glycine makes up 192 and 190/1000, respectively.
Therefore, glycine may not be a limiting factor. Nuclear genes use 52/1000 proline as opposed to
42/1000 in chloroplasts. However, the currently used codon for proline (CCG) can be modified
to CCA or CCT to further enhance translation. It is known that pathways for proline and
valine are compartmentalized in chloroplasts (Guda et al. 2000). Also, proline is known to
accumulate in chloroplasts as an osmoprotectant (Daniell et al., 1994).

g) Codon comparison of the CTB gene with psbA showed 47% homology with the most
frequent codons of the psbA gene. Codon analysis showed that 34% of the codons of CTB
are complementary to the tRNA population in the chloroplasts, in comparison with 51% of
psbA codons that are complementary to the chloroplast tRNA population. Because of the
high levels of CTB expression in transgenic chloroplasts (Henriques and Daniell, 2000),
there will be no need to modify the CTB gene.

DNA sequence of all constructs may be determined to confirm the correct orientation of
genes, in frame fusion, and accurate sequences in the recombinant DNA constructs. DNA
sequencing may be performed using a Perkin Elmer ABI Prism 373 DNA sequencing system using an
ABI Prism Dye Termination Cycle Sequencing kit. Insertion sites at both ends may be sequenced by
using primers for each strand.

Expression of all chloroplast vectors is first tested in E. coli before their use in tobacco
transformation because of the similarity of protein synthetic machinery (Brixey et al. 1997). For
Escherichia coli expression, the XL-1 Blue strain was used. E. coli may be transformed by a standard
CaCl2 method.

Bombardment and Regeneration of Chloroplast Transgenic Plants: Tobacco (Nicotiana
tabacum var. Petit Havana) and nicotine free edible tobacco (LAMD 605, gift from Dr. Keith
Wycoff, Planet Biotechnology) plants may be grown aseptically by germination of seeds on MSO
medium (Daniell 1993). Fully expanded, dark green leaves of about two month old plants may be
used for bombardment.

Leaves may be placed abaxial side up on a Whatman No. 1 filter paper lying on the RMOP
medium (Daniell, 1993) in standard petri plates (100 x 15 mm) for bombardment. Gold (0.6 µm)
microprojectiles may be coated with plasmid DNA (chloroplast vectors) and bombardments carried
out with the biolistic device PDS-1000/He (Bio-Rad) as described by Daniell (1997). Following
bombardment, petri plates are sealed with parafilm and incubated at 24°C under a 12 h photoperiod.
Two days after bombardment, leaves are chopped into small pieces of ~5 mm² in size and placed on
the selection medium (RMOP containing 500 µg/ml of spectinomycin dihydrochloride) with the abaxial
side touching the medium in deep (100 x 25 mm) petri plates (~10 pieces per plate). The regenerated
spectinomycin resistant shoots are chopped into small pieces (~2 mm²) and subcloned into fresh deep
petri plates (~5 pieces per plate) containing the same selection medium. Resistant shoots from the
second culture cycle are transferred to the rooting medium (MSO medium supplemented with IBA, 1
mg/liter and spectinomycin dihydrochloride, 500 mg/liter). Rooted plants may be transferred to soil
and grown at 26°C under continuous lighting conditions for further analysis.

Polymerase Chain Reaction: PCR may be performed using DNA isolated from control and
transgenic plants to distinguish a) true chloroplast transformants from mutants and b) chloroplast
transformants from nuclear transformants. Primers for testing the presence of the aadA gene (that
confers spectinomycin resistance) in transgenic plants may land on the aadA coding sequence
and 16S rRNA gene (primers 1P & 1M). To test chloroplast integration of the insulin gene, one
primer lands on the aadA gene while another lands on the native chloroplast genome (primers
3P & 3M). No PCR product is obtained with nuclear transgenic plants using this set of primers. The
primer set (2P & 2M) may be used to test integration of the entire gene cassette without any internal
deletion or looping out during homologous recombination, by landing on the respective
recombination sites. A similar strategy has been used successfully by us to confirm chloroplast
integration of foreign genes (Daniell et al., 1998; Kota et al., 1999; Guda et al., 2000). This
screening is essential to eliminate mutants and nuclear transformants. Total DNA from
unbombarded and transgenic plants may be isolated as described by Edwards et al. (1991) to conduct
PCR analyses in transgenic plants. Chloroplast transgenic plants containing the proinsulin gene may
then be moved to a second round of selection to achieve homoplasmy.

Southern Blot Analysis: Southern blots are performed to determine the copy number of the
introduced foreign gene per cell as well as to test homoplasmy. There are several thousand copies of
the chloroplast genome present in each plant cell. Therefore, when foreign genes are inserted into
the chloroplast genome, it is possible that some of the chloroplast genomes have foreign genes
integrated while others remain as the wild type (heteroplasmy). Therefore, to ensure that only the
transformed genome exists in cells of transgenic plants (homoplasmy), the selection process is
continued. To confirm that the wild type genome does not exist at the end of the selection cycle,
total DNA from transgenic plants should be probed with the chloroplast border (flanking) sequences
(the trnI-trnA fragment as shown in Figs. 2A and 3B). If wild type genomes are present
(heteroplasmy), the native fragment size is observed along with transformed genomes. The presence
of a large fragment (due to insertion of foreign genes within the flanking sequences) and absence of
the native small fragment confirms homoplasmy (Daniell et al., 1998; Kota et al., 1999; Guda et al.,
2000).

The copy number of the integrated gene is determined by establishing homoplasmy for the
transgenic chloroplast genome. Tobacco chloroplasts contain 5000-10,000 copies of their genome
per cell (Daniell et al. 1998). If only a fraction of the genomes are actually transformed, the copy
number, by default, must be less than 10,000. By establishing that in the transgenics the insulin
inserted transformed genome is the only one present, one can establish that the copy number is
5000-10,000 per cell. This is usually done by digesting the total DNA with a suitable restriction
enzyme and probing with the flanking sequences that enable homologous recombination into the
chloroplast genome. The native fragment present in the control should be absent in the transgenics.
The absence of the native fragment proves that only the transgenic chloroplast genome is present in the
cell and that there is no native, untransformed chloroplast genome lacking the insulin gene.
This establishes the homoplasmic nature of the transformants, simultaneously providing an estimate
of 5000-10,000 copies of the foreign genes per cell.
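
The reasoning above, from the Southern band pattern to the copy number estimate, can be written out as a small decision rule. The sketch below is illustrative only: the function names and the boolean band inputs are assumptions made here, and the 5000-10,000 genomes per cell figure is the one cited in the text.

```python
def classify_plastome(native_band, transgenic_band):
    """Interpret a Southern blot probed with the chloroplast flanking sequences.

    native_band:     True if the small wild-type fragment is detected.
    transgenic_band: True if the larger fragment carrying the insert is detected.
    """
    if transgenic_band and not native_band:
        return "homoplasmic"
    if transgenic_band and native_band:
        return "heteroplasmic"
    if native_band:
        return "untransformed"
    return "inconclusive"


def transgene_copy_bounds(state, genomes_per_cell=(5000, 10000)):
    """Return (lower, upper) bounds on transgene copies per cell.

    For a heteroplasmic plant the true value lies strictly below the upper
    bound and the lower bound is unknown, so 0 is returned as a placeholder.
    """
    low, high = genomes_per_cell
    if state == "homoplasmic":
        return (low, high)
    if state == "heteroplasmic":
        return (0, high)
    return (0, 0)


state = classify_plastome(native_band=False, transgenic_band=True)
print(state, transgene_copy_bounds(state))
```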

Northern Blot Analysis: Northern blots may be performed to test the efficiency of transcription of
the proinsulin gene fused with CTB or polymer genes. Total RNA is isolated from 150 mg of frozen
leaves by using the "RNeasy Plant Total RNA Isolation Kit" (Qiagen Inc., Chatsworth, CA). RNA
(10-40 µg) is denatured by formaldehyde treatment, separated on a 1.2% agarose gel in the presence
of formaldehyde and transferred to a nitrocellulose membrane (MSI) as described in Sambrook et al.
(1989). Probe DNA (proinsulin gene coding region) may be labeled by the random-primed method
(Promega) with 32P-dCTP isotope. The blot is then pre-hybridized, hybridized and washed as
described above for Southern blot analysis. Transcript levels may be quantified by the Molecular
Analyst Program using the GS-700 Imaging Densitometer (Bio-Rad, Hercules, CA) or the like.

Polymer-insulin fusion protein purification, quantitation and characterization: Because
polymer insulin fusion proteins exhibit inverse temperature transition properties as shown in Figs. 9
and 10, they may be purified from transgenic plants essentially following the same method described
for polymer purification from transgenic tobacco plants (Zhang et al., 1996). Polymer extraction
buffer contains 50 mM Tris-HCl, pH 7.5, 1% 2-mercaptoethanol, 5 mM EDTA, 2 mM PMSF and
0.8 M NaCl. The homogenate is then centrifuged at 10,000 g for 10 minutes (4°C), and the pellet
discarded. The supernatant is incubated at 42°C for 30 minutes and then centrifuged immediately
for 3 minutes at 5,000 g (room temperature). If insulin is found to be sensitive to this temperature,
Tt is lowered by increasing salt concentration (McPherson et al., 1996). The pellet containing the
insulin-polymer fusion protein is resuspended in the extraction buffer and incubated on ice for 10
minutes. The mixture is centrifuged at 12,000 g for 10 minutes (4°C). The supernatant is then
collected and stored at -20°C. The purified polymer insulin fusion protein is electrophoresed in an
SDS-PAGE gel according to Laemmli (1970) and visualized by either staining with 0.3 M CuCl2
(Lee et al. 1987) or transferred to nitrocellulose membrane and probed with antiserum raised against
the polymer or insulin protein as described below. Quantification of purified polymer proteins may
be carried out by ELISA in addition to densitometry.

After electrophoresis, proteins may be transferred to a nitrocellulose membrane
electrophoretically in 25 mM Tris, 192 mM glycine, 5% methanol (pH 8.3). The filter is blocked
with 2% dry milk in Tris-buffered saline for two hours at room temperature and stained with
antiserum raised against the polymer AVGVP (kindly provided by the University of Alabama at
Birmingham, monoclonal facility) overnight in 2% dry milk/Tris-buffered saline. The protein bands
reacting to the antibodies are visualized using alkaline phosphatase-linked secondary antibody and
the substrates nitroblue tetrazolium and 5-bromo-4-chloro-3-indolyl-phosphate (Bio-Rad).
Alternatively, for insulin-polymer fusion proteins, a mouse anti-human proinsulin (IgG1)
monoclonal antibody may be used as a primary antibody. To detect the binding of the primary
antibody to the recombinant proinsulin, a goat anti-mouse IgG horseradish peroxidase labeled
monoclonal antibody (HRP) may be used. The substrate to be used for conjugation with HRP may
be 3,3',5,5'-tetramethylbenzidine. Products may be purchased from American Qualex Antibodies
in San Clemente, CA. As a positive control, human recombinant proinsulin from Sigma may be
used. This human recombinant proinsulin was expressed in E. coli by a synthetic proinsulin gene.
Quantification of purified polymer fusion proteins may be carried out by densitometry using
Scanning Analysis software (BioSoft, Ferguson, MO). Total protein contents may be determined by
the dye-binding assay using reagents supplied in a kit from Bio-Rad, with bovine serum albumin as a
standard.

Characterization of CTB expression: CTB protein levels in transgenic plant crude extract can be
determined using quantitative ELISA assays. A standard curve may be generated using known
concentrations of bacterial CTB. A 96-well microtiter plate loaded with 100 µl/well of bacterial
CTB (concentrations in the range of 10-1000 ng) is incubated overnight at 4°C. The plate is then
washed thrice with PBST (phosphate buffered saline containing 0.05% Tween-20). The background
may be blocked by incubation in 1% bovine serum albumin (BSA) in PBS (300 µl/well) at 37°C for
2 h followed by washing 3 times with PBST. The plate may be incubated in a 1:8,000 dilution of
rabbit anti-cholera toxin antibody (Sigma C-3062) (100 µl/well) for 2 h at 37°C, followed by
washing the wells three times with PBST. The plate may be incubated with a 1:80,000 dilution of
anti-rabbit IgG conjugated with alkaline phosphatase (100 µl/well) for 2 h at 37°C and washed thrice
with PBST. Then, 100 µl alkaline phosphatase substrate (Sigma Fast p-nitrophenyl phosphate tablet
in 5 ml of water) is added and the reaction stopped with 1 M NaOH (50 µl/well) when absorbances
in the mid-range of the titration reach about 2.0, or after 1 hour, whichever comes first. The plate is
then read at 405 nm. These results are used to generate a standard curve from which
concentrations of plant protein are extrapolated. Thus, total soluble plant protein (concentration
previously determined using the Bradford assay) in bicarbonate buffer, pH 9.6 (15 mM Na2CO3,
35 mM NaHCO3) may be loaded at 100 µl/well and the same procedure as above can be
repeated. The absorbance values can be used to determine the ratio of CTB protein to total soluble
plant protein, using the standard curve generated previously and the Bradford assay results.
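
The calculation in the last step can be illustrated with a short Python sketch. It assumes piecewise linear interpolation between the standards, which the text does not specify, and the standard curve readings and sample values shown are invented placeholders, not measured data. The sketch reads values only within the range of the standards and raises an error outside it.

```python
def interpolate_amount(absorbance, standards):
    """Estimate ng of CTB from an A405 reading.

    standards: list of (ng_CTB, A405) pairs from the bacterial CTB wells.
    Uses piecewise linear interpolation between neighbouring standards
    (an assumption of this sketch, not a method stated in the text).
    """
    points = sorted(standards, key=lambda pair: pair[1])
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if y0 <= absorbance <= y1:
            if y1 == y0:
                return x0
            return x0 + (absorbance - y0) * (x1 - x0) / (y1 - y0)
    raise ValueError("absorbance outside the range of the standard curve")


def percent_of_tsp(ctb_ng, tsp_ng):
    """CTB as a percentage of total soluble protein (TSP) loaded in the well."""
    return 100.0 * ctb_ng / tsp_ng


# HYPOTHETICAL readings for illustration only.
STANDARDS = [(10.0, 0.1), (100.0, 0.5), (1000.0, 2.0)]
SAMPLE_A405 = 0.3
SAMPLE_TSP_NG = 10000.0  # from the Bradford assay; placeholder value

ctb_ng = interpolate_amount(SAMPLE_A405, STANDARDS)
print(f"CTB: {ctb_ng:.1f} ng = {percent_of_tsp(ctb_ng, SAMPLE_TSP_NG):.2f}% of TSP")
```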

Inheritance of Introduced Foreign Genes: While it is unlikely that introduced DNA moves from
the chloroplast genome to the nuclear genome, it is possible that the gene can be integrated in the
nuclear genome during bombardment and remain undetected in Southern analysis. Therefore, in
initial tobacco transformants, some are allowed to self-pollinate, whereas others are used in
reciprocal crosses with control tobacco (transgenics as female acceptors and pollen donors; testing
for maternal inheritance). Harvested seeds (T1) are germinated on media containing spectinomycin.
Achievement of homoplasmy and mode of inheritance can be classified by looking at germination
results. Homoplasmy is indicated by totally green seedlings (Daniell et al., 1998) while
heteroplasmy is displayed by variegated leaves (lack of pigmentation, Svab & Maliga, 1993). Lack
of variation in chlorophyll pigmentation among progeny also underscores the absence of position
effect, an artifact of nuclear transformation. Maternal inheritance is demonstrated by sole
transmission of introduced genes via seed generated on transgenic plants, regardless of pollen source
(green seedlings on selective media). When transgenic pollen is used for pollination of control
plants, the resultant progeny are not resistant to the chemical in the selective media (they appear bleached;
Svab and Maliga, 1993). Molecular analyses can confirm transmission and expression of introduced
genes, and T2 seed is generated from those confirmed plants by the analyses described above.

Comparison of Current Purification with Polymer-based Purification Methods: It is important
to compare purification methods by testing yield and purity of insulin produced in E. coli and
tobacco. Three methods may be compared: a standard fusion protein in E. coli, polymer proinsulin
fusion protein in E. coli, and polymer proinsulin fusion in tobacco. Polymer proinsulin fusion
peptide from transgenic tobacco may be purified by methodology described in section c) and Daniell
(1997). E. coli purification is performed as follows. One liter of each pLD containing bacteria is
grown in LB/ampicillin (100 µg/ml) overnight and the fusion protein, either polymer-proinsulin or
the control fusion protein (Cowley and Mackin 1997), expressed. Cells are harvested by
centrifugation at 5000 X g for 10 min at 4°C, and the bacterial pellets resuspended in 5 ml/g (wet
wt. bacteria) of 100 mM Tris-HCl, pH 7.3. Lysozyme is added at a concentration of 1 mg/ml and
the suspension placed on a rotating shaker at room temperature for 15 min. The lysate is subjected to probe
sonication for two cycles of 30 s on/30 s off at 4°C. Cellular debris is removed by centrifugation at
1000 X g for 5 min at 4°C. The E. coli produced proinsulin polymer fusion protein is purified by
inverse temperature transition properties (Daniell et al., 1997). After Factor Xa cleavage (as
described in section c)) the proinsulin is isolated from the polymer using inverse temperature
transition properties (Daniell et al., 1997) and subjected to oxidative sulfitolysis as described below.
Alternatively, the control fusion protein is purified according to Cowley and Mackin (1997) as
follows. The supernatant is retained and centrifuged again at 27000 X g for 15 min at 4°C to pellet
the inclusion bodies. The supernatant is then discarded and the pellet resuspended in 1 ml/g
(original wt. bacteria) of dH2O, aliquoted into microcentrifuge tubes as 1 ml fractions, and then
centrifuged at 16000 X g for 5 min at 4°C. The pellets are individually washed with 1 ml of 100
mM Tris-HCl, pH 8.5, 1 M urea, 1% Triton X-100 and again washed with 100 mM Tris-HCl, pH 8.5, 2
M urea, 2% Triton X-100. The pellets are then resuspended in 1 ml of dH2O and transferred to a
pre-weighed 30 ml Corex centrifuge tube. The sample is centrifuged at 15000 X g for 5 min at 4°C,
and the pellet resuspended in 10 ml/g (wet wt. pellet) of 70% formic acid. Cyanogen bromide is
added to a final concentration of 400 mM and the sample incubated at room temperature in the dark
for 16 h. The reaction is stopped by transferring the sample to a round bottom flask and removing
the solvent by rotary evaporation at 50°C. The residue is resuspended in 20 ml/g (wet wt. pellet)
of dH2O, shell frozen in a dry ice ethanol bath, and then lyophilized. The lyophilized protein is
dissolved in 20 ml/g (wet wt. pellet) of 500 mM Tris-HCl, pH 8.2, 7 M urea. Oxidative sulfitolysis
may be performed by adding sodium sulfite and sodium tetrathionate to final concentrations of 100
and 10 mM, respectively, and incubating at room temperature for 3 h. This reaction is stopped by
freezing on dry ice.

Purification and folding of Human Proinsulin: The S-sulfonated material may be applied to a 2
ml bed of Sephadex G-25 equilibrated in 20 mM Tris-HCl, pH 8.2, 7 M urea, and then washed with
9 vols of 7 M urea. The collected fraction is applied to a Pharmacia Mono Q HR 5/5 column
equilibrated in 20 mM Tris-HCl, pH 8.2, 7 M urea at a flow rate of 1 ml/min. A linear gradient
leading to a final concentration of 0.5 M NaCl is used to elute the bound material. Fractions of 2 min (2 ml)
are collected during the gradient, and protein concentration in each fraction determined.
Purity and molecular mass of fractions are estimated by Tricine SDS-PAGE (as shown in Fig. 2),
where Tricine is used as the trailing ion to allow better resolution of peptides in the range of 1-1000
kDa. Appropriate fractions are pooled and applied to a 1.6 X 20 cm column of Sephadex G-25
(superfine) equilibrated in 5 mM ammonium acetate pH 6.8. The sample is collected based on UV
absorbance and freeze-dried. The partially purified S-sulfonated material is then resuspended in 50
mM glycine/NaOH, pH 10.5 at a final concentration of 2 mg/ml. β-mercaptoethanol is added at a
ratio of 1.5 mol per mol of cysteine S-sulfonate and the sample stirred at 4°C in an open container
for 16 h. The sample is then analyzed by reversed-phase high-performance liquid chromatography
(RP-HPLC) using a Vydac C4 column (2.2 X 150 mm) equilibrated in 4% acetonitrile and 0.1%
TFA. Adsorbed peptides are eluted with a linear gradient of increasing acetonitrile concentration
(0.88% per min up to a maximum of 48%). The remaining refolded proinsulin is centrifuged at
16000 X g to remove insoluble material, and loaded onto a semi-preparative Vydac C4 column (10 X
250 mm). The bound material is then eluted as described above, and the proinsulin collected and
lyophilized.
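
As a quick check on the RP-HPLC gradient quoted above, its duration follows directly from the start and end acetonitrile concentrations and the slope. The helper names below are assumptions for illustration; the 4%, 48% and 0.88% per min values are taken from the text.

```python
def gradient_minutes(start_pct, end_pct, slope_pct_per_min):
    """Time needed for a linear gradient to run from start_pct to end_pct."""
    if slope_pct_per_min <= 0:
        raise ValueError("slope must be positive")
    return (end_pct - start_pct) / slope_pct_per_min


def percent_at(minute, start_pct, end_pct, slope_pct_per_min):
    """Acetonitrile percentage at a given time, capped at the maximum."""
    return min(end_pct, start_pct + slope_pct_per_min * minute)


duration = gradient_minutes(4.0, 48.0, 0.88)
print(f"Gradient from 4% to 48% acetonitrile takes about {duration:.0f} min")
```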

Analysis and characterization of insulin expressed in E. coli and Tobacco: The purified
expressed proinsulin is subjected to matrix-assisted laser desorption/ionization-time of flight
(MALDI-TOF) analysis (as described by Cowley and Mackin, 1997), using proinsulin from Eli Lilly
as both an internal and external standard. To determine if the disulfide bridges have formed
correctly naturally inside chloroplasts or by in vitro processing, a proteolytic digestion is performed
using Staphylococcus aureus protease V8. Five µg of both the expressed proinsulin and Eli Lilly's
proinsulin are lyophilized and resuspended in 50 µl of 250 mM NaPO4, pH 7.8. Protease V8 is
added at a ratio of 1:50 (w/w) in experimental samples and no enzyme added to the controls. All
samples are then incubated overnight at 37°C, the reactions stopped by freezing on dry ice, and
samples stored at -20°C until analyzed. The samples are analyzed by RP-HPLC using a Vydac C4
column (2.2 X 150 mm) equilibrated in 4% acetonitrile and 0.1% TFA. Bound material is then
eluted using a linear gradient of increasing acetonitrile concentration (0.88% per min up to a
maximum of 48%).

CTB-GM1 ganglioside binding assay: A GM1-ELISA assay may be performed as described by
Arakawa et al. (1997) to determine the affinity of plant-derived CTB for GM1-ganglioside. The
microtiter plate is coated with monosialoganglioside-GM1 (Sigma G-7641) by incubating the plate
with 100 µl/well of GM1 (3.0 µg/ml) in bicarbonate buffer, pH 9.6 at 4°C overnight. Alternatively,
the wells can be coated with 100 µl/well of BSA (3.0 µg/ml) as control. The plates are incubated
with transformed plant total soluble protein and bacterial CTB (Sigma C-9903) in PBS (100 µl/well)
overnight at 4°C. The remainder of the procedure is identical to the ELISA described above.

Induction of oral tolerance: Four week old female NOD mice may, for example, be purchased
from Jackson Laboratory (Bar Harbor, ME) and housed at an animal care facility. The mice are
divided into three groups, each group consisting of ten mice. Each group is fed one of the following
nicotine free edible tobacco: untransformed, expressing CTB, or expressing CTB-proinsulin fusion
protein. Beginning at 5 weeks of age, each mouse is fed 3 g of nicotine free edible tobacco once per
week until reaching 9 weeks of age (a total of five feedings).

Antibody titer: At ten weeks of age, the serum and fecal material are assayed for anti-CTB and
anti-proinsulin antibody isotypes using the ELISA method described above.

Assessment of diabetic symptoms in NOD mice: The incidence of diabetic symptoms can be
compared among mice fed control nicotine free edible tobacco, tobacco that expresses CTB, and
tobacco that expresses the CTB-proinsulin fusion protein. Starting at 10 weeks of age, the mice are monitored
on a biweekly basis with urinary glucose test strips (Clinistix and Diastix, Bayer) for development of
diabetes. Glycosuric mice are bled from the tail vein to check for glycemia using a glucose analyzer
(Accu-Check, Boehringer Mannheim). Diabetes is confirmed by hyperglycemia (>250 mg/dl) for
two consecutive weeks (Ma et al. 1997).

a. SPECIFIC AIMS

Research on human proteins in the past years has revolutionized the use of these
therapeutically valuable proteins in a variety of clinical situations. Since the demand for these
proteins is expected to increase considerably in the coming years, it would be wise to ensure that
in the future they will be available in significantly larger amounts, preferably on a cost-effective
basis. Because most genes can be expressed in many different systems, it is essential to
determine which system offers the most advantages for the manufacture of the recombinant
protein. The ideal expression system would be one that produces a maximum amount of safe,
biologically active material at a minimum cost. The use of modified mammalian cells with
recombinant DNA techniques has the advantage of resulting in products which are closely
related to those of natural origin; however, culturing of these cells is intricate and can only be
carried out on a limited scale. The use of microorganisms such as bacteria permits manufacture on
a larger scale, but introduces the disadvantage of producing products which differ appreciably
from the products of natural origin. For example, proteins that are usually glycosylated in
humans are not glycosylated by bacteria. Furthermore, human proteins that are expressed at high
levels in E. coli frequently acquire an unnatural conformation, accompanied by intracellular
precipitation due to lack of proper folding and disulfide bridges. Production of recombinant
proteins in plants has many potential advantages for generating biopharmaceuticals relevant to
clinical medicine. These include the following: (i) plant systems are more economical than
industrial facilities using fermentation systems; (ii) technology is available for harvesting and
processing plants/plant products on a large scale; (iii) elimination of the purification requirement
when the plant tissue containing the recombinant protein is used as a food (edible vaccines); (iv)
plants can be directed to target proteins into stable, intracellular compartments such as chloroplasts, or
proteins can be expressed directly in chloroplasts; (v) the amount of recombinant product that can be produced
approaches industrial-scale levels; and (vi) health risks due to contamination with potential
human pathogens/toxins are minimized.

It has been estimated that one tobacco plant should be able to produce more recombinant
protein than a 300-liter fermenter of E. coli (Crop Tech, VA). In addition, a tobacco plant
produces a million seeds, facilitating large-scale production. Tobacco is also an ideal choice
because of its relative ease of genetic manipulation and an impending need to explore alternate
uses for this hazardous crop. However, with the exception of enzymes (e.g. phytase), levels of
foreign proteins produced in nuclear transgenic plants are generally low, mostly less than 1% of
the total soluble protein (Kusnadi et al. 1997). May et al. (1996) discuss this problem using the
following examples. Although plant derived recombinant hepatitis B surface antigen was as
effective as a commercial recombinant vaccine, the levels of expression in transgenic tobacco
were low (0.0066% of total soluble protein). Even though Norwalk virus capsid protein
expressed in potatoes caused oral immunization when consumed as food (edible vaccine),
expression levels were low (0.3% of total soluble protein). In particular, expression of human
proteins in nuclear transgenic plants has been disappointingly low: e.g. human interferon-β
0.000017% of fresh weight, human serum albumin 0.02% and erythropoietin 0.0026% of total
soluble protein (see table 1 in Kusnadi et al. 1997). A synthetic gene coding for the human
epidermal growth factor was expressed only up to 0.001% of total soluble protein in transgenic
tobacco (May et al. 1996). The cost of producing recombinant proteins in alfalfa leaves was
estimated to be 12-fold lower than in potato tubers and comparable with seeds (Kusnadi et al.
1997). However, tobacco leaves are much larger and have much higher biomass than alfalfa.
Planet Biotechnology has recently estimated that at 50 mg/liter of mammalian cell culture or
transgenic goat's milk or 50 mg/kg of tobacco leaf expression, the cost of purified IgA will be
$10,000, 1000 and 50/g, respectively (Daniell et al. 2000). The cost of production of
recombinant proteins will be 50-fold lower than that of E. coli fermentation (with 20% expression
levels in E. coli; Kusnadi et al. 1997). A decrease in insulin expression from 20% to 5% of
biomass doubled the cost of production in E. coli (Petridis et al. 1995). An expression level of less than
1% of total soluble protein in plants has been found not to be commercially feasible (Kusnadi et
al. 1997). Therefore, it is important to increase levels of expression of recombinant proteins in
plants in order to exploit plant production of pharmacologically important proteins.

An alternate approach is to express foreign proteins in chloroplasts of higher plants. We
have recently integrated foreign genes (up to 10,000 copies per cell) into the tobacco chloroplast
genome resulting in accumulation of recombinant proteins up to 46% of the total soluble protein
(De Cosa et al. 2001). Chloroplast transformation utilizes two flanking sequences that, through
homologous recombination, insert foreign DNA into the spacer region between the functional
genes of the chloroplast genome, thus targeting the foreign genes to a precise location. This
eliminates the "position effect" and gene silencing frequently observed in nuclear transgenic
plants. Chloroplast genetic engineering is an environmentally friendly approach, minimizing
concerns of out-cross of introduced traits via pollen to weeds or other crops (Bock and
Hagemann 2000, Heifetz 2000). Also, the concerns of insects developing resistance to
biopesticides are minimized by hyper-expression of single insecticidal proteins (high dosage) or
expression of different types of insecticides in a single transformation event (gene pyramiding).
Concerns of insecticidal proteins on non-target insects are minimized by lack of expression in
transgenic pollen (De Cosa et al. 2001).

Most importantly, a significant advantage in the production of pharmaceutical proteins in
chloroplasts is their ability to process eukaryotic proteins, including folding and formation of
disulfide bridges (Drescher et al. 1998). Chaperonin proteins are present in chloroplasts (Roy,
1989; Vierling, 1991) that function in folding and assembly of prokaryotic/eukaryotic proteins.
Also, proteins are activated by disulfide bond oxido/reduction cycles using the chloroplast
thioredoxin system (Ruelland and Miginiac-Maslow, 1999) or chloroplast protein disulfide
isomerase (Kim and Mayfield, 1997). Accumulation of the fully assembled, disulfide bonded form
of human somatotropin via chloroplast transformation (Staub et al. 2000), the oligomeric form of
CTB (Henriques and Daniell, 2000) and the assembly of heavy/light chains of humanized Guy's
13 antibody in transgenic chloroplasts (Panchal et al. 2000) provide strong evidence for
successful processing of pharmaceutical proteins inside chloroplasts. Such folding and assembly
should eliminate the need for highly expensive in vitro processing of pharmaceutical proteins.
For example, 60% of the total operating cost in the production of human insulin is associated
with in vitro processing (formation of disulfide bridges and cleavage of methionine, Petridis et
al. 1995).

Another major cost of insulin production is purification; chromatography accounts for 
30% of operating expenses and 70% of equipment in production of insulin (Petridis et al. 1995).
Therefore, new approaches are necessary to minimize or eliminate chromatography in insulin
production. One such approach is the use of GVGVP as a fusion protein to facilitate single step
purification without the use of chromatography. GVGVP is a Protein Based Polymer (PBP)
made from synthetic genes; at lower temperatures this polymer exists as more extended
molecules. Upon raising the temperature above the transition range, the polymer hydrophobically
folds into dynamic structures called β-spirals that further aggregate by hydrophobic association
to form twisted filaments (Urry, 1991; Urry et al., 1994). Inverse temperature transition offers
several advantages. It facilitates scale up of purification from grams to kilograms. The mild
purification conditions require only a modest change in temperature and ionic strength. This
should also facilitate higher recovery, faster purification and high volume processing; protein
purification is generally the slow step (bottleneck) in pharmaceutical product development.
Through exploitation of this reversible inverse temperature transition property, simple and
inexpensive extraction and purification may be performed. The temperature at which the
aggregation takes place can be manipulated by engineering biopolymers containing varying
numbers of repeats and changing salt concentration in solution (McPherson et al., 1996).
Chloroplast mediated expression of insulin-polymer fusion protein should eliminate the need for
a fermentation process as well as reagents needed for recombinant protein
purification and downstream processing.

Oral delivery of insulin is yet another powerful approach that would eliminate 97% of the 
production cost of insulin (Petridis et al. 1995). For example, Sun et al. (1994) have shown that
feeding a small dose of antigens conjugated to the receptor binding non-toxic B subunit moiety
of the cholera toxin (CTB) suppressed systemic T cell-mediated inflammatory reactions in
animals. Oral administration of a myelin antigen conjugated to CTB has been shown to protect
animals against encephalomyelitis, even when given after disease induction (Sun et al. 1996).
Bergerot et al. (1997) reported that feeding small amounts of human insulin conjugated to CTB
suppressed beta cell destruction and clinical diabetes in adult non-obese diabetic (NOD) mice.
The protective effect could be transferred by T cells from CTB-insulin treated animals and was
associated with reduced insulitis. These results demonstrate that protection against autoimmune
diabetes can indeed be achieved by feeding small amounts of a pancreas islet cell autoantigen
linked to CTB (Bergerot et al. 1997). Conjugation with CTB facilitates antigen delivery and
presentation to the Gut Associated Lymphoid Tissues (GALT) due to its affinity for the cell
surface receptor GM1-ganglioside located on GALT cells, for increased uptake and immunologic
recognition (Arakawa et al. 1998). Transgenic potato tubers expressed up to 0.1% CTB-insulin
fusion protein of total soluble protein, which retained GM1-ganglioside binding affinity and
native antigenicity for both CTB and insulin. NOD mice fed with transgenic potato tubers
containing microgram quantities of CTB-insulin fusion protein showed a substantial reduction in
insulitis and a delay in the progression of diabetes (Arakawa et al. 1998). However, for
commercial exploitation, the levels of expression should be increased in transgenic plants.
Therefore, we propose here expression of CTB-insulin fusion in transgenic chloroplasts of
nicotine free edible tobacco to increase expression to levels adequate for animal testing.

Taken together, low levels of expression of human proteins in nuclear transgenic plants, 
and difficulty in folding, assembly/processing of human proteins in E. coli should make
chloroplasts an alternate compartment for expression of these proteins; production of human
proteins in transgenic chloroplasts should also dramatically lower the production cost. Large-scale
production of insulin in tobacco in conjunction with an oral delivery system should be a
powerful approach to provide treatment to diabetes patients at an affordable cost and provide
tobacco farmers alternate uses for this hazardous crop. Therefore, the first objective of this
project is to use poly(GVGVP) as a fusion protein to enable hyper-expression of insulin and
accomplish rapid one step purification of the fusion peptide utilizing the inverse temperature
transition properties of this polymer. The second objective is to develop insulin-CTB fusion
protein for oral delivery in nicotine free edible tobacco (LAMD 605).

Both objectives will be accomplished as follows: 
a) Develop recombinant DNA vectors for enhanced expression of proinsulin as fusion proteins
with GVGVP or CTB via chloroplast genomes of tobacco

b) Obtain transgenic tobacco (Petit Havana & LAMD 605) plants

c) Characterize transgenic expression of proinsulin polymer or CTB fusion proteins using
molecular and biochemical methods in chloroplasts

d) Employ existing or modified methods of polymer purification from transgenic leaves

e) Analyze Mendelian or maternal inheritance of transgenic plants

f) Large scale purification of insulin and comparison of current insulin purification methods
with polymer-based purification method in E. coli and tobacco

g) Compare natural refolding in chloroplasts with in vitro processing

h) Characterization (yield and purity) of proinsulin produced in E. coli and transgenic tobacco

i) Assessment of diabetic symptoms in mice fed with edible tobacco expressing CTB-insulin
fusion protein.

b. BACKGROUND AND SIGNIFICANCE 

Diabetes and Insulin: The most obvious action of insulin is to lower blood glucose (Oakly et al.
1973). This is a result of its immediate effect in increasing glucose uptake in tissues. In muscle,
under the action of insulin, glucose is more readily taken up and either converted to glycogen
and lactic acid or oxidized to carbon dioxide. Insulin also affects a number of important enzymes
concerned with cellular metabolism. It increases the activity of glucokinase, which
phosphorylates glucose, thereby increasing the rate of glucose metabolism in the liver. Insulin
also suppresses gluconeogenesis by depressing the function of liver enzymes that operate the
reverse pathway from proteins to glucose. Lack of insulin can restrict the transport of glucose
into muscle and adipose tissue. This results in increases in blood glucose levels (hyperglycemia).
In addition, the breakdown of natural fat to free fatty acids and glycerol is increased and there is
a rise in the fatty acid content in the blood. Increased catabolism of fatty acids by the liver results
in greater production of ketone bodies. They diffuse from the liver and pass to the muscles for
further oxidation. Soon, ketone body production rate exceeds oxidation rate and ketosis results.
Fewer amino acids are taken up by the tissues and protein degradation results. At the same time
gluconeogenesis is stimulated and protein is used to produce glucose. Obviously, lack of insulin
has serious consequences.

Diabetes is classified into types I and II. Type I is also known as insulin dependent
diabetes mellitus (IDDM). Usually this is caused by a cell-mediated autoimmune destruction of
the pancreatic β-cells (Davidson, 1998). Those suffering from this type are dependent on
external sources of insulin. Type II is known as noninsulin-dependent diabetes mellitus
(NIDDM). This usually involves resistance to insulin in combination with its underproduction.
These prominent diseases have led to extensive research into microbial production of
recombinant human insulin (rHI).

Expression of Recombinant Human Insulin in E. coli: In 1978, two thousand kilograms of
insulin were used in the world each year; half of this was used in the United States (Steiner et al.,
1978). At that time, the number of diabetics in the US was increasing 6% every year (Gunby,
1978). In 1997-98, a 10% increase in sales of diabetes care products and a 19% increase in insulin
products were reported by Novo Nordisk (world's leading supplier of insulin), making it a
7.8 billion dollar industry. Annually, 160,000 Americans are killed by diabetes, making it the
fourth leading cause of death. Many methods of production of rHI have been developed. Insulin
genes were first chemically synthesized for expression in Escherichia coli (Crea et al., 1978).
These genes encoded separate insulin A and B chains. The genes were each expressed in E. coli
as fusion proteins with β-galactosidase (Goeddel et al., 1979). The first documented
production of rHI using this system was reported by David Goeddel from Genentech (Hall,
1988). The genes were fused to the Trp synthase gene, which resulted in increased insulin yield,
due to the smaller fusion peptide. This fusion protein was approved for commercial production
by Eli Lilly in 1982 (Chance and Frank, 1993) with a product name of Humulin. As of 1986,
Humulin was produced from proinsulin genes. Proinsulin contains both insulin chains and the
C-peptide that connects them. Normal in vitro post-translational processing of proinsulin includes
use of trypsin and carboxypeptidase B for maturation to insulin. Other data concerning
commercial production of Humulin and other insulin products are now considered proprietary
information and are not available to the public.

Protein Based Polymers (PBP): The synthetic gene that codes for a bioelastic PBP was
designed after the repeated amino acid sequence GVGVP, observed in all sequenced mammalian
elastin proteins (Yen et al. 1987). Elastin is one of the strongest known natural fibers and is
present in skin, ligaments, and arterial walls. Bioelastic PBPs containing multiple repeats of this
pentamer have remarkable elastic properties, enabling several medical and non-medical
applications (Urry et al. 1993, Urry 1995, Daniell 1995). GVGVP polymers prevent adhesions
following surgery, aid in reconstructing tissues and delivering drugs to the body over an
extended period of time. North American Science Associates, Inc. reported that GVGVP
polymer is non-toxic in mice, non-sensitizing and non-antigenic in guinea pigs, and
non-pyrogenic in rabbits (Urry et al. 1993). Researchers have also observed that inserting sheets of
GVGVP at the sites of contaminated wounds in rats reduces the number of adhesions that form
as the wounds heal (Urry et al. 1993). In a similar manner, using the GVGVP to encase muscles
that are cut during eye surgery in rabbits prevents scarring following the operation (Urry et al.
1993, Urry 1995). Other medical applications of bioelastic PBPs include tissue reconstruction
(synthetic ligaments and arteries, bones), wound coverings, artificial pericardia, catheters and
programmed drug delivery (Urry, 1995; Urry et al., 1993, 1996).

We have expressed the elastic PBP (GVGVP)121 in E. coli (Guda et al. 1995, Brixey et
al. 1997), in the fungus Aspergillus nidulans (Herzog et al. 1997), in cultured tobacco cells
(Zhang et al. 1995), and in transgenic tobacco plants (Zhang et al. 1996). In particular,
(GVGVP)121 has been expressed to such high levels in E. coli that polymer inclusion bodies
occupied up to about 90% of the cell volume; also, inclusion bodies have been observed in
chloroplasts of transgenic tobacco plants (see attached article, Daniell and Guda, 1997).
Recently, we reported stable transformation of the tobacco chloroplasts by integration and
expression of the biopolymer gene (EG121) into the Large Single Copy region (5,000 copies per
cell) or the Inverted Repeat region (10,000 copies per cell) of the chloroplast genome (Guda et
al., 2000).
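
As a rough size check on the (GVGVP)121 polymer named above, the following sketch computes its length and approximate mass from the pentamer repeat. The average residue masses are standard textbook values rather than figures from this application, the function name is illustrative only, and any initiator methionine or flanking residues encoded by the EG121 gene are ignored.

```python
# Average residue masses in daltons (amino acid minus one water).
# These are general reference values, not data from this application.
RESIDUE_MASS = {"G": 57.0519, "V": 99.1326, "P": 97.1167}
WATER = 18.0153  # one water is added back for the free chain termini


def polymer_stats(unit, repeats):
    """Return (residue count, approximate mass in Da) for a repeat polymer."""
    length = len(unit) * repeats
    mass = sum(RESIDUE_MASS[aa] for aa in unit) * repeats + WATER
    return length, mass


length, mass_da = polymer_stats("GVGVP", 121)
print(f"(GVGVP)121: {length} residues, about {mass_da / 1000:.1f} kDa")
```

Under these assumptions the polymer is 605 residues long and just under 50 kDa, which gives a sense of the size of the fusion partner relative to proinsulin.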

PBP as Fusion Proteins: Several systems are now available to simplify protein purification
including the maltose binding protein (Marina et al. 1988), glutathione S-transferase (Smith and
Johnson, 1988), biotinylated (Tsao et al. 1996), thioredoxin (Smith et al. 1998) and cellulose
binding (Ong et al. 1989) proteins. In order to effectively utilize the aforementioned fusion proteins
in the purification process, recombinant DNA vectors for fusion with short peptides are now
available (Smith et al. 1988; Kim and Raines, 1993; Su et al. 1992). Recombinant proteins are
generally purified by affinity chromatography, using ligands specific to carrier proteins (Nilsson
et al. 1997). While these are useful techniques for laboratory scale purification, affinity
chromatography for large-scale purification is time consuming and cost prohibitive. Therefore,
economical and non-chromatographic techniques are highly desirable. In addition, a common
solution to N-terminal degradation of small peptides is to fuse foreign peptides to endogenous E.
coli proteins. Early in the development of this technique, β-galactosidase (β-gal) was used as a
fusion protein (Goldberg and Goff, 1986). A drawback of this method was that the β-gal protein
is of high molecular weight (MW 100,000). Therefore, the proportion of the peptide
product in the total protein is low. Another problem associated with the large β-gal fusion is
early termination of translation (Burnette, 1983; Hall, 1988). This occurred when β-gal was used
to produce human insulin peptides because the fusion was detached from the ribosome during
translation, thus yielding incomplete peptides. In order to increase the peptide production, other
proteins of lower molecular weight have been used as fusion proteins. For example,
better yields were obtained with the tryptophan synthase (190aa) fusion proteins (Hall, 1988,
Burnett, 1983).

One of the primary goals of this study is to use poly(GVGVP) as a fusion protein to
enable hyper-expression of insulin and accomplish rapid one step purification of the fusion
peptide. At lower temperatures the polymers exist as more extended molecules which, on raising
the temperature above the transition range, hydrophobically fold into dynamic structures called
β-spirals that further aggregate by hydrophobic association to form twisted filaments (Urry, 1991).
Through exploitation of this reversible property, simple and inexpensive extraction and
purification is performed. The temperature at which aggregation takes place (Tt) can be
manipulated by engineering biopolymers containing varying numbers of repeats or changing salt
concentration (McPherson et al., 1996). Another group has recently demonstrated purification of
recombinant proteins by fusion with thermally responsive polypeptides (Meyer and Chilkoti,
1999). Polymers of different sizes have been synthesized and expressed in E. coli in the PI's
laboratory. This approach would also eliminate the need for expensive reagents, equipment and
time required for purification.

Cholera Toxin B subunit as a fusion protein: Vibrio cholerae causes diarrhea by colonizing the
small intestine and producing enterotoxins, of which the cholera toxin (CT) is considered the
main cause of toxicity. CT is a hexameric AB5 protein having one 27 kDa A subunit, which has
toxic ADP-ribosyl transferase activity, and a non-toxic pentamer of 11.6 kDa B subunits that are
non-covalently linked into a very stable doughnut-like structure into which the toxic active (A)
subunit is inserted. The A subunit of CT consists of two fragments, A1 and A2, which are linked
by a disulfide bond. The enzymatic activity of CT is located solely on the A1 fragment (Gill,
1976). The A2 fragment of the A subunit links the A1 fragment and the B pentamer. CT binds
via specific interactions of the B subunit pentamer with GM1 ganglioside, the membrane
receptor, present on the intestinal epithelial cell surface of the host. The A subunit is then
translocated into the cell where it ADP-ribosylates the Gs subunit of adenylate cyclase, bringing
about the increased levels of cyclic AMP in affected cells that is associated with the electrolyte
and fluid loss of clinical cholera (Lebens et al. 1994). For optimal enzymatic activity, the A1
fragment needs to be separated from the A2 fragment by proteolytic cleavage of the main chain
and by reduction of the disulfide bond linking them (Mekalanos et al. 1979).
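
The AB5 stoichiometry described above implies simple mass totals for the B pentamer and the holotoxin. The sketch below only adds up the subunit masses quoted in the text (27 kDa for A, 11.6 kDa for each B); the function and variable names are illustrative, and the results are nominal sums rather than measured values.

```python
# Subunit masses as quoted in the text, in kDa.
A_SUBUNIT_KDA = 27.0
B_SUBUNIT_KDA = 11.6
B_COPIES = 5  # AB5: one A subunit plus a pentamer of B subunits


def ab5_masses(a_kda, b_kda, b_copies=5):
    """Return (B pentamer mass, holotoxin mass) in kDa for an AB5 toxin."""
    pentamer = b_kda * b_copies
    holotoxin = a_kda + pentamer
    return pentamer, holotoxin


pentamer_kda, holotoxin_kda = ab5_masses(A_SUBUNIT_KDA, B_SUBUNIT_KDA, B_COPIES)
print(f"B pentamer: {pentamer_kda:.1f} kDa, holotoxin: {holotoxin_kda:.1f} kDa")
```

These nominal totals are useful as a reference point when reading apparent molecular weights of oligomeric CTB on gels, which can differ from the calculated pentamer mass.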

The expression and assembly of CTB in transgenic potato tubers has been reported
(Arakawa et al. 1997). The CTB gene including the leader peptide was fused to an endoplasmic
reticulum retention signal (SEKDEL) at the 3' end to sequester the CTB protein within the
lumen of the ER. The DNA fragment encoding the 21-amino acid leader peptide of the CTB
protein was retained in order to direct the newly synthesized CTB protein into the lumen of the
ER. Immunoblot analysis indicated that the plant derived CTB protein was antigenically
indistinguishable from the bacterial CTB protein and that oligomeric CTB molecules (Mr ~ 50
kDa) were the dominant molecular species isolated from transgenic potato leaf and tuber tissues.
Similar to bacterial CTB, plant derived CTB dissociated into monomers (Mr ~ 15 kDa) during
heat/acid treatment.


Enzyme linked immunosorbent assay methods indicated that plant synthesized CTB
protein bound specifically to GM1 gangliosides, the natural membrane receptors of Cholera
Toxin. The maximum amount of CTB protein detected in auxin induced transgenic potato leaf
and tuber tissues was approximately 0.3% of the total soluble protein. The oral immunization of
CD-1 mice with transgenic potato tissues transformed with the CTB gene (administered at
weekly intervals for a month with a final booster feeding on day 65) has also been reported. The
levels of serum and mucosal anti-cholera toxin antibodies in mice were found to generate
protective immunity against the cytopathic effects of CT holotoxin. Following intraileal injection
with CT, the plant immunized mice showed up to a 60% reduction in diarrheal fluid
accumulation in the small intestine. Systemic and mucosal CTB-specific antibody titers were
determined in both serum and feces collected from immunized mice by the class-specific
chemiluminescent ELISA method and the endpoint titers for the three antibody isotypes
(IgM, IgG and IgA) were determined. The extent of CT neutralization in both Vero cell and ileal
loop experiments suggested that anti-CTB antibodies prevent CT binding to cellular
GM1-gangliosides. Also, mice fed with 3 g of transgenic potato exhibited similar intestinal protection
as mice gavaged with 30 µg of bacterial CTB. Recombinant LTB [rLTB] (the heat labile
enterotoxin produced by enterotoxigenic E. coli), which is structurally, functionally and
immunologically similar to CTB, was expressed in transgenic tobacco (Arntzen et al. 1998; Haq
et al. 1995). They have reported that the rLTB retained its antigenicity as shown by
immunoprecipitation of rLTB with antibodies raised to rLTB from E. coli. The rLTB protein was
of the expected molecular weight and aggregated to form the pentamer, as confirmed by gel
permeation chromatography.
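
The feeding comparison above (3 g of transgenic potato giving protection similar to 30 µg of gavaged bacterial CTB) can be restated as a per-gram equivalence. The sketch below is only that arithmetic; the function name is illustrative, and the result is an equivalence in protective effect, not a measured CTB content of the tissue.

```python
def equivalent_ctb_per_gram(bacterial_ctb_ug, tissue_g):
    """Bacterial-CTB-equivalent dose (ug) per gram of plant tissue fed."""
    if tissue_g <= 0:
        raise ValueError("tissue mass must be positive")
    return bacterial_ctb_ug / tissue_g


# Figures quoted in the text: 30 ug bacterial CTB vs 3 g transgenic potato.
ug_per_g = equivalent_ctb_per_gram(bacterial_ctb_ug=30.0, tissue_g=3.0)
print(f"about {ug_per_g:.0f} ug bacterial-CTB equivalent per g of potato")
```

A figure of this kind is what motivates raising expression levels: higher CTB content per gram of tissue would reduce the amount of plant material needed per dose.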

Delivery of Human Insulin: Insulin has been delivered intravenously in the past several years.
However, more recently, alternate methods such as nasal spray are also available. Oral delivery
of insulin is yet another new approach (Mathiowitz et al., 1997). Engineered polymer
microspheres made of biologically erodable polymers, which display strong interactions with
gastrointestinal mucus and cellular linings, can traverse both mucosal absorptive epithelium and
the follicle-associated epithelium covering the lymphoid tissue of Peyer's patches. Polymers
maintain contact with intestinal epithelium for extended periods of time and actually penetrate
through and between cells. Animals fed with the poly(FA:PLGA)-encapsulated insulin
preparation were able to regulate the glucose load better than controls, confirming that insulin
crossed the intestinal barrier and was released from the microspheres in a biologically active
form (Mathiowitz et al., 1997).


In addition, CTB has been demonstrated to be an effective carrier molecule for the
induction of mucosal immunity to polypeptides to which it is chemically or genetically
conjugated (McKenzie et al. 1984; Dertzbaugh et al. 1993). The production of
immunomodulatory transmucosal carrier molecules, such as CTB, in plants may greatly improve
the efficacy of edible plant vaccines (Haq et al. 1995; Thanavala et al. 1995; Mason et al. 1996)
and may also provide novel oral tolerance agents for prevention of such autoimmune diseases as
Type I diabetes (Zhang et al. 1991), rheumatoid arthritis (Trentham et al. 1993), multiple
sclerosis (Khoury et al. 1990; Miller et al. 1992; Weiner et al. 1993) as well as the prevention of
allergic and allograft rejection reactions (Sayegh et al. 1992; Hancock et al. 1993). Therefore,
expressing a CTB-proinsulin fusion would be an ideal approach for oral delivery of insulin.

Chloroplast Genetic Engineering: When we developed the concept of chloroplast genetic
engineering (Daniell and McFadden, 1988 U.S. Patents; Daniell, World Patent, 1999), it was
possible to introduce isolated intact chloroplasts into protoplasts and regenerate transgenic plants
(Carlson, 1973). Therefore, early investigations on chloroplast transformation focused on the
development of in organello systems using intact chloroplasts capable of efficient and prolonged
transcription and translation (Daniell and Rebeiz, 1982; Daniell et al., 1983, 1986) and
expression of foreign genes in isolated chloroplasts (Daniell and McFadden, 1987). However,
after the discovery of the gene gun as a transformation device (Daniell, 1993), it was possible to
transform plant chloroplasts without the use of isolated plastids and protoplasts. Chloroplast
genetic engineering was accomplished in several phases. Transient expression of foreign genes
in plastids of dicots (Daniell et al., 1990; Ye et al., 1990) was followed by such studies in
monocots (Daniell et al., 1991). Unique to chloroplast genetic engineering is the development
of a foreign gene expression system using autonomously replicating chloroplast expression
vectors (Daniell et al., 1990). Stable integration of a selectable marker gene into the tobacco
chloroplast genome (Svab and Maliga, 1993) was also accomplished using the gene gun.
However, useful genes conferring valuable traits via chloroplast genetic engineering have been
demonstrated only recently. For example, plants resistant to B.t. sensitive insects were obtained
by integrating the cryIAc gene into the tobacco chloroplast genome (McBride et al., 1995). Plants
resistant to B.t. resistant insects (up to 40,000 fold) were obtained by hyper-expression of the
cryIIA gene within the tobacco chloroplast genome (Kota et al., 1999). Plants have also been
genetically engineered via the chloroplast genome to confer herbicide resistance and the
introduced foreign genes were maternally inherited, overcoming the problem of out-cross with
weeds (Daniell et al., 1998). Chloroplast genetic engineering has also been used to produce
pharmaceutical products that are not used by plants (Staub et al. 2000, Guda et al. 2000).
Chloroplast genetic engineering technology is currently being applied to other useful crops
(Sidorov et al. 1999; Daniell, 1999).

c. PRELIMINARY STUDIES 

A remarkable feature of chloroplast genetic engineering is the observation of
exceptionally large accumulation of foreign proteins in transgenic plants, as much as 46% of
CRY protein in total soluble protein, even in bleached old leaves (De Cosa et al. 2001). Stable
expression of a pharmaceutical protein in chloroplasts was first reported for GVGVP, a protein
based polymer with varied medical applications (such as the prevention of post-surgical
adhesions and scars, wound coverings, artificial pericardia, tissue reconstruction and
programmed drug delivery) (Guda et al. 2000). Subsequently, expression of the human
somatotropin via the tobacco chloroplast genome (Staub et al. 2000) to high levels (7% of total
soluble protein) was observed. The following investigations that are in progress in the Daniell
lab illustrate the power of this technology to express small peptides, entire operons, vaccines that
require oligomeric proteins with stable disulfide bridges and monoclonals that require assembly
of heavy/light chains via chaperonins. In order for the edible insulin approach to be successful, it is
essential to develop a selection system free of antibiotic resistance genes. One such marker-free
chloroplast transformation system has been accomplished in this laboratory (Daniell et al. 2000).
Experiments are in progress to develop chloroplast transformation of edible leaves (alfalfa and
lettuce) for the practical applications of this approach.

Engineering novel pathways via the chloroplast genome: In plant and animal cells, nuclear
mRNAs are translated monocistronically. This poses a serious problem when engineering
multiple genes in plants (Bogorad, 2000). Therefore, in order to express the polyhydroxybutyrate
polymer or Guy's 13 antibody, single genes were first introduced into individual transgenic
plants, then these plants were back-crossed to reconstitute the entire pathway or the complete
protein (Nawrath et al. 1994; Ma et al. 1995). Similarly, in a seven year long effort, Ye et al.
(2000) recently introduced a set of three genes for a short biosynthetic pathway that resulted in
β-carotene expression in rice. In contrast, most chloroplast genes of higher plants are
cotranscribed (Bogorad, 2000). Expression of polycistrons via the chloroplast genome provides a
unique opportunity to express entire pathways in a single transformation event. We have recently
used the Bacillus thuringiensis (Bt) cry2Aa2 operon as a model system to demonstrate operon
expression and crystal formation via the chloroplast genome (De Cosa et al. 2001). Cry2Aa2 is
the distal gene of a three-gene operon. The orf immediately upstream of cry2Aa2 codes for a
putative chaperonin that facilitates the folding of cry2Aa2 (and other proteins) to form
proteolytically stable cuboidal crystals (Ge et al. 1998).

Therefore, the cry2Aa2 bacterial operon was expressed in tobacco chloroplasts to test the
resultant transgenic plants for increased expression and improved persistence of the accumulated
insecticidal protein(s). Stable foreign gene integration was confirmed by PCR and Southern blot
analysis in T0 and T1 transgenic plants. Cry2Aa2 operon derived protein accumulated at 45.3% of
the total soluble protein in mature leaves and remained stable even in old bleached leaves
(46.1%) (Figure 1). This is the highest level of foreign gene expression ever reported in
transgenic plants. Insects that are exceedingly difficult to control (10-day old cotton bollworm,
beet armyworm) were killed 100% after consuming transgenic leaves. Electron micrographs showed the
presence of the insecticidal protein folded into cuboidal crystals similar in shape to Cry2Aa2
crystals observed in Bacillus thuringiensis (Figure 2). In contrast to currently marketed
transgenic plants with soluble CRY proteins, folded protoxin crystals will be processed only by
target insects that have alkaline gut pH; this approach should improve safety of Bt transgenic
plants. Absence of insecticidal proteins in transgenic pollen eliminates toxicity to non-target
insects via pollen. In addition to these environmentally friendly approaches, this observation
should serve as a model system for large-scale production of foreign proteins within chloroplasts
in a folded configuration enhancing their stability and facilitating single step purification. This is
the first demonstration of expression of a bacterial operon in transgenic plants and opens the
door to engineer novel pathways in plants in a single transformation event.

Expressing small peptides via the chloroplast genome: It is common knowledge that the
medical community has been fighting a vigorous battle against drug resistant pathogenic bacteria
for years. Cationic antibacterial peptides from mammals, amphibians and insects have gained
more attention over the last decade (Hancock and Lehrer, 1998). Key features of these cationic
peptides are a net positive charge, an affinity for negatively-charged prokaryotic membrane
phospholipids over neutral-charged eukaryotic membranes and the ability to form aggregates that
disrupt the bacterial membrane (Biggin and Sansom, 1999).

There are three major peptides with α-helical structures: cecropin from Hyalophora
cecropia (giant silk moth), magainins from Xenopus laevis (African frog) and defensins from
mammalian neutrophils. Magainin and its analogues have been studied as a broad-spectrum
topical agent, a systemic antibiotic, a wound-healing stimulant and an anticancer agent (Jacob
and Zasloff, 1994). We have recently observed that a synthetic lytic peptide (MSI-99, 22 amino
acids) can be successfully expressed in tobacco chloroplasts (DeGray et al. 2000). The peptide
retained its lytic activity against the phytopathogenic bacterium Pseudomonas syringae and the
multidrug resistant human pathogen Pseudomonas aeruginosa. The anti-microbial peptide
(AMP) used in this study was an amphipathic alpha-helix molecule that has an affinity for
negatively charged phospholipids commonly found in the outer membrane of bacteria. Upon
contact with these membranes, individual peptides aggregate to form pores in the membrane,
resulting in bacterial lysis. Because of the concentration dependent action of the AMP, it was
expressed via the chloroplast genome to accomplish high dose delivery at the point of infection.
PCR products and Southern blots confirmed chloroplast integration of the foreign genes and
homoplasmy. Growth and development of the transgenic plants was unaffected by
hyper-expression of the AMP within chloroplasts. In vitro assays with T0 and T1 plants confirmed that
the AMP was expressed at high levels (21.5 to 43% of the total soluble protein) and retained
biological activity against Pseudomonas syringae, a major plant pathogen. In situ assays resulted
in intense areas of necrosis around the point of infection in control leaves, while transformed
leaves showed no signs of necrosis (200-800 µg of AMP at the site of infection) (Figure 3). T1 in
vitro assays against Pseudomonas aeruginosa (a multi-drug resistant human pathogen) displayed
a 96% inhibition of growth (Figure 4). These results give a new option in the battle against
phytopathogenic and drug-resistant human pathogenic bacteria. Small peptides (like insulin) are
degraded in most organisms. However, stability of this AMP in chloroplasts opens up this
compartment for expression of hormones and other small peptides.

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d assembly of monocl nals in transgenic chloroplasts: Dental caries (cavities) 
is probably the most prevalent disease of humankind. Colonization of teeth by S. mutans is the 
single most important risk factor in the development of dental caries. S. mutans is a non-motile, 
Gram-positive coccus. It colonizes tooth surfaces and synthesizes glucans (insoluble
polysaccharides) and fructans from sucrose using the enzymes glucosyltransferase and
fructosyltransferase, respectively (Hotz et al. 1972). The glucans play an important role by
allowing the bacterium to adhere to the smooth tooth surfaces. After its adherence, the bacterium
ferments sucrose and produces lactic acid. Lactic acid dissolves the minerals of the tooth,
producing a cavity.






A topical monoclonal antibody therapy to prevent adherence of S. mutans to teeth has
recently been developed. The incidence of cariogenic bacteria (in humans and animals) and
dental caries (in animals) was dramatically reduced for periods of up to two years after the
cessation of the antibody therapy. No adverse events were detected either in the exposed animals
or in human volunteers (Ma et al. 1998). The annual requirement for this antibody in the US
alone may eventually exceed 1 metric ton. Therefore, this antibody was expressed via the
chloroplast genome to achieve higher levels of expression and proper folding (Panchal et al.
2000). The integration of antibody genes into the chloroplast genome was confirmed by PCR and
Southern blot analysis. The expression of both heavy and light chains was confirmed by western
blot analysis under reducing conditions (Figure 7A,B). The expression of fully assembled
antibody was confirmed by western blot analysis under non-reducing conditions (Figure 7C).
This is the first report of successful assembly of a multi-subunit human protein in transgenic
chloroplasts. Production of monoclonal antibodies at an agricultural scale should reduce their cost
and create new applications of monoclonal antibodies.


Marker-free chloroplast transgenic plants: Most transformation techniques co-introduce a gene
that confers antibiotic resistance, along with the gene of interest that imparts a desired trait.
Regenerating transformed cells in antibiotic-containing growth media permits selection of only
those cells that have incorporated the foreign genes. Once transgenic plants are regenerated,
antibiotic resistance genes serve no useful purpose but they continue to produce their gene
products. One of the primary concerns about genetically modified (GM) crops is the presence
of clinically important antibiotic resistance gene products in transgenic plants that could
inactivate oral doses of the antibiotic (reviewed by Puchta 2000; Daniell 1999A). Alternatively,
the antibiotic resistance genes could be transferred to pathogenic microbes in the gastrointestinal
tract or soil, rendering them resistant to treatment with such antibiotics. Antibiotic resistant
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e of the major challenges of modem medicine, In Germany, GM crops containing 
antibiotic resistant genes have been banned from release (Peerenboom 2000). 

Chloroplast genetic engineering offers several advantages over nuclear transformation,
including high levels of gene expression and gene containment, but utilizes thousands of copies
of the most commonly used antibiotic resistance genes. Engineering genetically modified (GM)
crops without the use of antibiotic resistance genes should eliminate the potential risk of their
transfer to the environment or gut microbes. Therefore, the betaine aldehyde dehydrogenase
(BADH) gene from spinach is used in this study as a selectable marker (Daniell et al. 2000). The
selection process involves conversion of toxic betaine aldehyde (BA) by the chloroplast BADH
enzyme to nontoxic glycine betaine, which also serves as an osmoprotectant. Chloroplast
transformation efficiency was 25-fold higher with BA selection than with spectinomycin, in
addition to rapid regeneration (Table 1). Transgenic shoots appeared within 12 days in 80% of leaf
discs (up to 23 shoots per disc) on BA selection compared to 45 days in 15% of discs (1 or 2 shoots
per disc) on spectinomycin selection (Figure 8). Southern blots confirm stable integration of foreign
genes into all of the chloroplast genomes (~10,000 copies per cell), resulting in homoplasmy.
Transgenic tobacco plants showed 1527-1816% higher BADH activity at different
developmental stages than untransformed controls. Transgenic plants were morphologically
indistinguishable from untransformed plants and the introduced trait was stably inherited in the
subsequent generation. This is the first report of genetic engineering of the chloroplast genome
without the use of antibiotic selection. Use of genes that are naturally present in spinach for
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Idition to gene containment, should ease public concerns or perception of GM 
crops. Also, this should be very helpful in the development of edible insulin. 



Expression of cholera toxin B subunit oligomers as a vaccine in chloroplasts: CTB, when
administered orally (Lebens and Holmgren, 1994), is a potent mucosal immunogen, which can
neutralize the toxicity of the CT holotoxin by preventing it from binding to the intestinal cells
(Mor et al. 1998). This is believed to be a result of its binding to eukaryotic cell surfaces via the
GM1 gangliosides, receptors present on the intestinal epithelial surface, thus eliciting a mucosal
immune response to pathogens (Lipscombe et al. 1991) and enhancing the immune response
when chemically coupled to other antigens (Dertzbaugh and Elson, 1993; Holmgren et al. 1993;
Nashar et al. 1993; Sun et al. 1994).



Cholera toxin B subunit (CTB) has previously been expressed in nuclear transgenic plants at
levels of 0.01% (leaves) to 0.3% (tubers) of the total soluble protein. To increase expression levels, we
engineered the chloroplast genome to express the unmodified CTB gene (Henriques and Daniell,
2000). We observed expression of oligomeric CTB at levels of 4-5% of total soluble plant
protein (Figure 5A). PCR and Southern blot analyses confirmed stable integration of the CTB
gene into the chloroplast genome. Western blot analysis showed that transgenic chloroplast
expressed CTB was antigenically identical to commercially available purified CTB antigen
(Figure 6). Also, GM1-ganglioside binding assays confirm that chloroplast-synthesized CTB
binds to the intestinal membrane receptor of cholera toxin (Figure 5B). Transgenic tobacco
plants were morphologically indistinguishable from untransformed plants and the introduced
gene was found to be stably inherited in the subsequent generation, as confirmed by PCR and
Southern blot analyses. The increased production of an efficient transmucosal carrier molecule
and delivery system, like CTB, in chloroplasts of plants makes plant-based oral vaccines and
fusion proteins with CTB needing oral administration a much more feasible approach. These
observations establish unequivocally that chloroplasts are capable of forming disulfide bridges to
assemble foreign proteins, and are ideal for expression of CTB fusion proteins.
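
The gain over nuclear expression can be put in rough numbers. The following Python sketch is illustrative only and is not part of the reported experiments: it divides the chloroplast expression range (4-5% of total soluble protein) by the nuclear range quoted above (0.01-0.3%). The function name `fold_range` is a hypothetical helper.

```python
def fold_range(new_low, new_high, old_low, old_high):
    """Return the (smallest, largest) fold increase between two ranges.

    The smallest ratio pairs the lowest new value with the highest old value;
    the largest ratio pairs the highest new value with the lowest old value.
    """
    return new_low / old_high, new_high / old_low


# Values quoted in the text, in percent of total soluble protein (TSP).
smallest, largest = fold_range(4.0, 5.0, 0.01, 0.3)
print(f"approximately {smallest:.0f}-fold to {largest:.0f}-fold")
```

The two ends of the range compare different tissues (tubers versus leaves in the nuclear transgenics), so the ratios are only an order-of-magnitude guide.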



Polymer-proinsulin Recombinant DNA Vectors: One possible insulin expression system
involves independent expression of insulin chains A and B, as it has been produced in E. coli for
commercial purposes in the past. The disadvantage of this method is that E. coli does not form
disulfide bridges in the cell unless the protein is targeted to the periplasm. Expensive in vitro
assembly after purification is necessary for this approach. Therefore, a better approach would be
to express the human proinsulin as a polymer fusion protein. This method is ideal because
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e capable of forming disulfide bridges. Using a single gene, as opposed to the 
individual chains, would eliminate the necessity of conducting two parallel vector construction 
processes, as is required for the individual chains. In addition, the need for individual
fermentations and purification procedures is eliminated by the single gene method. Moreover,
proinsulin requires less processing following extraction.



Recently, the human pre-proinsulin gene was obtained from Genentech, Inc. First, the
pre-proinsulin gene was sub-cloned into pUC19 to facilitate further manipulations. The next step was to
design primers to make chloroplast expression vectors. Since we are interested in proinsulin
expression, the 5' primer was designed to land on the proinsulin sequence. This FW primer
excluded the 69 bases, or 23 coded amino acids, of the leader or pre-sequence of preproinsulin.
Also, the forward primer included the enzymatic cleavage site for the protease factor Xa to avoid
the use of cyanogen bromide. Besides the factor Xa site, a SmaI site was introduced to facilitate
subsequent subcloning. The order of the FW primer sequence is SmaI - Xa-factor - Proinsulin
gene. The reverse primer included BamHI and XbaI sites, plus a short sequence with homology
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9 sequence following the proinsulin gene. The 297bp PCR product (Xa Pris) was 
cloned into pCR2.1. A GVGVP 50-mer was generated as described previously (Daniell et al.
1997) along with the RBS sequence GAAGGAG. Another SmaI partial digestion was performed
to eliminate the stop codon of the biopolymer gene, decrease the 50mer to a 40mer, and fuse the
40mer to the Xa-proinsulin sequence. Once the correct fragment was obtained by the partial
digestion with SmaI (eliminating the stop codon but including the RBS site), it was ligated to the
Xa-proinsulin fusion gene, resulting in the construct pCR2.1-40-XaPris. Finally, the biopolymer
(40mer) - proinsulin fusion gene was subcloned into the chloroplast vector pLD-CtV or pSBL-
CtV and the orientation was checked in the final vector using suitable restriction sites.
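
The 5' tail of the forward primer described above can be sketched in Python. This is an illustration under stated assumptions, not the primer actually used: the SmaI recognition sequence (CCCGGG) and the factor Xa recognition motif (Ile-Glu-Gly-Arg) are standard, but the particular codons chosen for the motif and the helper names are hypothetical, and the proinsulin-specific 3' portion of the primer is left out because the text does not give it.

```python
# Minimal codon table covering only the codons used in this sketch.
CODON_TO_AA = {
    "CCC": "P", "GGG": "G",                          # SmaI site read in frame
    "ATC": "I", "GAA": "E", "GGT": "G", "CGT": "R",  # assumed Xa-site codons
}

SMAI_SITE = "CCCGGG"                       # SmaI recognition sequence
XA_CODONS = ["ATC", "GAA", "GGT", "CGT"]   # hypothetical codon choice for IEGR


def translate(dna):
    """Translate an in-frame DNA string using the small table above."""
    codons = [dna[i:i + 3] for i in range(0, len(dna), 3)]
    return "".join(CODON_TO_AA[c] for c in codons)


def forward_primer_tail():
    """Return the SmaI - factor Xa portion of the forward primer (5' to 3')."""
    return SMAI_SITE + "".join(XA_CODONS)


tail = forward_primer_tail()
print(tail, translate(tail))

# The leader excluded by the primer: 69 bases correspond to 23 codons.
LEADER_BASES = 69
print(LEADER_BASES // 3)
```

Keeping the SmaI site and the Xa codons in the same reading frame as proinsulin is what lets the upstream polymer be fused in frame after the SmaI digestion.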


Expression and Purification of the Biopolymer-proinsulin fusion protein: XL-1 Blue strain
of E. coli containing pLD-OC-XaPris and the negative controls, which included a plasmid
containing the gene in the reverse orientation and the E. coli strain without any plasmid, were
grown in TB broth. Cell pellets were resuspended in 500 µl of autoclaved dH2O or 6M guanidine
hydrochloride phosphate buffer, pH 7.0, sonicated and centrifuged at 4°C at 10,000g for
10 min. After centrifugation, the supernatants were mixed with an equal volume of 2X TN buffer
(100 mM Tris-HCl, pH 8, 100 mM NaCl). Tubes were warmed at 42°C for 25 min to induce
biopolymer aggregation. Then the fusion protein was recovered by centrifuging at 2,500 rpm at
42°C for 3 min. Samples were run in a 16.5% Tricine gel, transferred to a nitrocellulose
membrane, and immunoblotting was performed. When the sonic extract is in 6M guanidine
hydrochloride phosphate buffer, pH 7.0, the apparent molecular weight changes from its original and
correct MW of 24 kDa to a higher MW of approximately 30 kDa (Figure 9 A3). This is probably
due to the conformation of the biopolymer in this buffer.

The gel was first stained with 0.3M CuCl2 and then the same gel was stained with
Coomassie R-250 Staining Solution for an hour and then destained for 15 min first, and then
overnight. CuCl2 creates a negative stain (Lee et al. 1987). Polymer proteins (without fusion)
appear as clear bands against a blue background in color or dark against a light semiopaque
background (Figure 9A). This stain was used because other protein stains such as Coomassie
Blue R250 do not stain the polymer protein due to the lack of aromatic side chains (McPherson
et al., 1992). Therefore, the observation of the 24 kDa protein in the R250-stained gel (Figure 9B) is
due to the insulin fusion with the polymer. This observation was further confirmed by probing
these blots with the anti-human proinsulin antibody. As anticipated, the polymer insulin fusion
protein was observed in western blots (Figure 10A,B). Larger proteins observed (Figure 10A-C)
are tetramer and hexamer complexes of proinsulin. It is evident that the insulin-polymer fusion
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ble in E.coli. Confinning this observation, recently another lab has shown that the 
PBP polymer protein conjugates (with thioredoxin and tendamistat) undergo thermally reversible
phase transition, retaining the transition behavior of the free polymer (Meyer and Chilkoti,
1999). These results clearly demonstrate that insulin fusion has not affected the inverse
temperature transition property of the polymer. One of the concerns is the stability of insulin
at temperatures used for thermally reversible purification. However, temperature-induced
production of human insulin has been in commercial use (Schmidt et al. 1999). Also, the temperature
transition can be lowered by increasing the ionic strength of the solution during purification of this PBP
(McPherson et al. 1996). Thus, GVGVP-fusion could be used to purify a multitude of
economically important proteins in a simple inexpensive step.

Biopolymer-proinsulin fusion gene expression in chloroplast: As described in section d, the
chloroplast vector was introduced into the tobacco chloroplast genome via particle
bombardment (Daniell, 1997). PCR and Southern blots were performed to confirm
biopolymer-proinsulin fusion gene integration into the chloroplast genome. Southern blots show homoplasmy in
most T0 lines but a few showed some heteroplasmy (Figure 11). Western blots show the
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polymer proinsuiin fusion protein in all transgenic lines (Figure 10C). 
Quantification using ELISA is in progress.

Protease Xa Digestion of the Biopolymer-proinsulin fusion protein and Purification of
Proinsulin: The enzymatic cleavage of the fusion protein to release the proinsulin protein from
the (GVGVP)40 was initiated by adding the factor Xa protease to the purified fusion protein at a
ratio (w/w) of approximately 1:500. Cleavage of the fusion protein was monitored by SDS-
PAGE analysis. We detected cleaved proinsulin in the extracts isolated in 6M guanidine
hydrochloride buffer (Figure 10A,B). Conditions are now being optimized for complete
cleavage. The Xa protease has been successfully used previously to cleave a (GVGVP)20-GST
fusion (McPherson et al. 1992).
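
The protease dose implied by the 1:500 (w/w) ratio is a simple proportion. The sketch below assumes the ratio is protease mass to fusion protein mass; the function name and the 1 mg example quantity are hypothetical and are not taken from the experiments.

```python
def protease_mass_ug(fusion_protein_ug, ratio=500):
    """Mass of protease (in micrograms) for a 1:ratio (w/w) digest."""
    if fusion_protein_ug < 0 or ratio <= 0:
        raise ValueError("mass must be non-negative and ratio positive")
    return fusion_protein_ug / ratio


# Example: 1 mg (1000 ug) of purified fusion protein, an assumed quantity.
print(protease_mass_ug(1000))
```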

d. RESEARCH DESIGN AND METHODS 
Evaluation of chloroplast gene expression: A systematic approach to identify and overcome
potential limitations of foreign gene expression in chloroplasts of transgenic plants is essential.
Information gained in this study should increase the utility of the chloroplast transformation system
for scientists interested in expressing other foreign proteins. Therefore, it is important to
systematically analyze transcription, RNA abundance, RNA stability, rate of protein synthesis
and degradation, proper folding and biological activity. For example, the rate of transcription of
the introduced insulin gene will be compared with the highly expressing endogenous chloroplast
genes (rbcL, psbA, 16S rRNA), using run-on transcription assays to determine if the 16S rRNA
promoter is operating as expected. Transgenic chloroplasts containing each of the three constructs
with different 5' regions will be investigated to test their transcription efficiency. Similarly,
transgene RNA levels will be monitored by northerns, dot blots and primer extension relative to
endogenous rbcL, 16S rRNA, or psbA. These results, along with run-on transcription assays,
should provide valuable information on RNA stability, processing, etc. In our past experience
with expression of several foreign genes, foreign transcripts appear to be extremely stable based
on northern blot analysis. However, a systematic study would be valuable to advance the utility of
this system for other scientists. Most importantly, the efficiency of translation will be tested in
isolated chloroplasts and compared with the highly translated chloroplast protein (psbA).
Pulse-chase experiments would help assess whether translational pausing or premature termination occurs.
Evaluation of percent RNA loaded on polysomes or in constructs with or without 5' UTRs would
help determine the efficiency of the ribosome binding site and 5' stem-loop translational
enhancers. Codon-optimized genes will also be compared with unmodified genes to investigate
the rate of translation, pausing and termination. In our recent experience, we observed a 200-fold
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^cumulation of foreign proteins due to decreases in proteolysis conferred by a 
putative chaperonin (De Cosa et al. 2001). Therefore, proteins from constructs expressing or not
expressing the putative chaperonin (with or without ORF1+2) should provide valuable
information on protein stability. Thus, all of this information will be used to improve the next
generation of chloroplast vectors. The PI has extensive experience in analysis of chloroplast gene
expression (relevant publications are included in resume).

Optimization of gene expression: We have reported that foreign genes are expressed between
3% (cry2Aa2) and 46% (cry2Aa2 operon) of total soluble protein in transgenic chloroplasts (Kota
et al. 1999; De Cosa et al. 2001). Several approaches will be used to enhance translation of the
recombinant proteins. In chloroplasts, transcriptional regulation as a bottleneck in gene expression has
been overcome by utilizing the strong constitutive promoter of the 16S rRNA (Prrn). One advantage of
Prrn is that it is recognized by both the chloroplast-encoded RNA polymerase and the nuclear-encoded
chloroplast RNA polymerase in tobacco (Allison et al. 1996). Several investigators have utilized
Prrn in their studies to overcome the initial hurdle of gene expression, transcription (De Cosa et
al. 2001, Eibl et al. 1999, Staub et al. 2000). RNA stability appears to be one of the lesser
problems because foreign transcripts accumulate to excess, at times
16,966-fold higher than in highly expressing nuclear transgenic plants (Lee et al. 2000). Also,
other investigations regarding RNA stability in chloroplasts suggest that efforts for optimizing
gene expression need to be addressed at the post-transcriptional level (Higgs et al. 1999, Eibl et
al. 1999). We intend to focus our investigation on protein expression at the
post-transcriptional level. For example, 5' and 3' UTRs are necessary for optimal translation and mRNA
stability of chloroplast mRNAs (Zerges 2000). Optimal ribosomal binding sites (RBSs) as well
as a stem-loop structure located 5' adjacent to the RBS are required for efficient translation. A
recent study has shown that replacement of the Shine-Dalgarno sequence (GGAGG) with the psbA 5'
UTR downstream of the 16S rRNA promoter enhanced translation of a foreign gene (GUS)
a hundred-fold (Eibl et al. 1999). Therefore, the 200-bp tobacco chloroplast DNA fragment (1680-
1480) containing the 5' psbA UTR will be used. This PCR product will be inserted downstream of
the 16S rRNA promoter to enhance translation of the recombinant proteins.


Yet another approach for enhancement of translation would be to optimize codon
compositions. We have compared the A+T% content of all foreign genes that had been expressed in
transgenic chloroplasts in our laboratory with the percentage of chloroplast expression. We
found that higher levels of A+T always correlated with high expression levels (see Table 2). It is
also potentially possible to modify chloroplast protease recognition sites while modifying
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it affecting their biological functions. Therefore, optimizing codon compositions 
of insulin and polymer genes to match the psbA gene should enhance the level of translation. 
Although the rbcL product (RuBisCO) is the most abundant protein on earth, rbcL is not translated as
highly as the psbA gene due to the extremely high turnover of the psbA gene product. The psbA gene is
under stronger selection for increased translation efficiency and its product is the most abundant
thylakoid protein. In addition, the codon usage in higher plant chloroplasts is biased towards the NNC
codon of 2-fold degenerate groups (i.e. TTC over TTT, GAC over GAT, CAC over CAT, AAC
over AAT, ATC over ATT, ATA etc.). This is in addition to a strong bias towards T at the third
position of 4-fold degenerate groups. There is also a context effect that should be taken into
consideration while modifying specific codons. The 2-fold degenerate sites immediately
upstream from a GNN codon do not show this bias towards NNC (TTT GGA is preferred to
TTC GGA while TTC CGT is preferred to TTT CGT, TTC AGT to TTT AGT and TTC TCT to
TTT TCT; Morton, 1993; Morton and Bernadette, 2000). In addition, highly expressed
chloroplast genes use GNN more frequently than other genes. The web sites
http://www.kazusa.or.jp/codon and http://www.ncbi.nlm.nih.gov will be used to optimize codon
composition by comparing codon usage of different plant species' genomes and psbA genes.
Abundance of amino acids in chloroplasts and tRNA anticodons present in chloroplasts will be
taken into consideration. Optimization of polymer and proinsulin will be done using a novel PCR
approach (Prodromou and Pearl, 1992; Casimiro et al. 1997), which has been successfully used
in our laboratory to optimize codon composition of other human proteins.
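
The third-position rules described above can be sketched as a small Python function. This is a simplified illustration, not the PCR-based optimization method cited: it handles only the two-fold degenerate pairs named in the text (TTY, GAY, CAY, AAY), prefers NNC by default, and switches to NNT when the next codon starts with G, which is one reading of the context effect. The names `SYNONYMOUS_PAIRS` and `apply_nnc_rule` are hypothetical.

```python
# Two-fold degenerate pairs named in the text, keyed by their first two bases.
SYNONYMOUS_PAIRS = {"TT": "Phe", "GA": "Asp", "CA": "His", "AA": "Asn"}


def apply_nnc_rule(cds):
    """Recode NNT/NNC codons of the listed pairs according to context.

    Default choice is NNC; if the following codon begins with G (GNN),
    NNT is used instead. All other codons are returned unchanged.
    """
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    codons = [cds[i:i + 3].upper() for i in range(0, len(cds), 3)]
    recoded = []
    for index, codon in enumerate(codons):
        if codon[:2] in SYNONYMOUS_PAIRS and codon[2] in "TC":
            next_codon = codons[index + 1] if index + 1 < len(codons) else ""
            third = "T" if next_codon.startswith("G") else "C"
            codon = codon[:2] + third
        recoded.append(codon)
    return "".join(recoded)


print(apply_nnc_rule("TTCGGA"))  # Phe followed by a GNN codon
print(apply_nnc_rule("TTTCGT"))  # Phe followed by a non-GNN codon
```

Only synonymous third-position changes are made, so the encoded amino acid sequence is unchanged. A fuller treatment would also weigh tRNA abundance and the four-fold degenerate groups mentioned in the text.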

Vector constructions: For all the constructs, the pLD vector will be used. This vector was developed
in this laboratory for chloroplast transformation. It contains the 16S rRNA promoter (Prrn)
driving the selectable marker gene aadA (aminoglycoside adenyl transferase conferring
resistance to spectinomycin) followed by the multiple cloning site and then the psbA 3' region
(the terminator from a gene coding for photosystem II reaction center components) from the
tobacco chloroplast genome. The pLD vector is a universal chloroplast expression/integration
vector and can be used to transform chloroplast genomes of several other plant species (Daniell
et al. 1998, Daniell 1999) because its flanking sequences are highly conserved among higher
plants. The universal vector uses trnA and trnI genes (chloroplast transfer RNAs coding for
alanine and isoleucine) from the inverted repeat region of the tobacco chloroplast genome as
flanking sequences for homologous recombination. Because the universal vector integrates
foreign genes within the inverted repeat region of the chloroplast genome, it should double the
copy number of the transgene (from 5000 to 10,000 copies per cell in tobacco). Furthermore, it
has been demonstrated that homoplasmy is achieved even in the first round of selection in
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biy because of the presence of a chloroplast origin of replication within the 
flanking sequence in the universal vector (thereby providing more templates for integration). 
For these and several other reasons, foreign gene expression was shown to be much higher when
the universal vector was used instead of the tobacco specific vector (Guda et al. 2000).

CTB-Proinsulin Vector Construction: The chloroplast expression vector pLD-CTB-Proins will
be constructed as follows. First, both proinsulin and cholera toxin B-subunit genes will be
amplified from suitable DNA using primer sequences. Primer 1 will contain the GGAGG
chloroplast preferred ribosome binding site five nucleotides upstream of the start codon (ATG)
for the CTB gene and a suitable restriction enzyme site (SpeI) for insertion into the chloroplast
vector. Primer 2 will eliminate the stop codon and add the first two amino acids of a flexible
hinge tetrapeptide GPGP as reported by Bergerot et al. (1997), in order to facilitate folding of the
CTB-proinsulin fusion protein. Primer 3 will add the remaining two amino acids for the hinge
tetrapeptide and eliminate the pre-sequence of the native pre-proinsulin. Primer 4 will add a
suitable restriction site (SpeI) for subcloning into the chloroplast vector. Amplified PCR
products will be inserted into the TA cloning vector. Both the CTB and proinsulin PCR
fragments will be excised at the SmaI and XbaI restriction sites. Eluted fragments will be ligated
into the TA cloning vector. The CTB-proinsulin fragment will be excised at the EcoRI sites and
inserted into the EcoRI-digested, dephosphorylated pLD vector.

We will design the following vectors to optimize protein expression, purification and production:
[Figure: vector map. Recoverable labels: Chloroplast Genome, trnI, trnA.]
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lcco plants, Eibl (1999) demonstrated, in vivo, the differences in translation 
efficiency and mRNA stability of a GUS reporter gene due to various 5' and 3' untranslated
regions (UTRs). We intend to implement this already described systematic transcription and
translation analysis in our practical endeavor of insulin production. Consistent with Eibl's
(1999) data for increased translation efficiency and mRNA stability, we will use the psbA 5'
UTR in addition to the psbA 3' UTR already in use. The 200 bp tobacco chloroplast DNA
fragment containing the 5' psbA UTR will be amplified by PCR using tobacco chloroplast DNA
as template. This fragment will be cloned directly in the pLD vector multiple cloning site
downstream of the promoter and the aadA gene. The cloned sequence will be exactly the
same as in the psbA gene.

b) Another approach to protein production in chloroplasts involves potential insulin
crystallization for facilitating purification. The cry2Aa2 Bacillus thuringiensis operon
derived putative chaperonin will be used. Expression of the cry2Aa2 operon in chloroplasts
provides a model system for hyper-expression of foreign proteins (46% of total soluble
protein) in a folded configuration, enhancing their stability and facilitating purification (De
Cosa et al. 2001). This justifies inclusion of the putative chaperonin from the cry2Aa2 operon
in one of the newly designed constructs. In this region there are two open reading frames
(ORF1 and ORF2) and a ribosomal binding site (rbs). This sequence contains elements
necessary for Cry2Aa2 crystallization, which may help to crystallize insulin and aid in
subsequent purification. Successful crystallization of other proteins using this putative
chaperonin has been demonstrated (Ge et al. 1998). We will amplify the ORF1 and ORF2 of
the Bt cry2Aa2 operon by PCR using the complete operon as template. Subsequent cloning,
using a novel PCR technique, will allow for direct fusion of this sequence immediately
upstream of the proinsulin fusion protein without altering the nucleotide sequence, which is
normally necessary to provide a restriction enzyme site (Horton et al. 1988).



c) To address codon optimization, the proinsulin gene will be subject to certain modifications
in subsequent constructs. The plastid modified proinsulin (PtPris) will have its nucleotide
sequence modified such that the codons are optimized for plastid expression, yet its amino
acid sequence will remain identical to human proinsulin. PtPris is an ideal substitute for
human proinsulin in the CTB fusion peptide. We intend to compare the expression of this
construct to the native human proinsulin to determine the effects of codon optimization,
which will serve to address, in a case study format, one relevant mechanistic parameter of
translation. Analysis of the human proinsulin gene showed that 48 of its 87 codons were the
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[uency codons in the chloroplast for the amino acid for which they encode. For 
example, there are six different codons for leucine. Their frequency within the chloroplast
genome ranges from 7.3 to 30.8 per thousand codons. There are 12 leucines in proinsulin; 8
have the lowest frequency codon (7.3), and none use the highest frequency codon
(30.8). In the plastid optimized proinsulin gene all the codons will be the most frequent ones,
whereas in human proinsulin over half of the codons are the least frequent. The human proinsulin
nucleotide sequence contains 62% C+G, whereas the plastid optimized proinsulin gene will
contain 24% C+G. Generally, lower C+G content of foreign genes correlates with higher
levels of expression (Table 2).






d) Another version of the proinsulin gene, mini-proinsulin (MPris), will also have its codons
optimized for plastid expression, and the amino acid sequence of its B and A chains will not
differ from human proinsulin (Pris). The Pris sequence is B Chain-RR-C Chain-KR-A Chain, whereas
the MPris sequence is B Chain-KR-A Chain. The MPris sequence excludes the RR-C Chain, which is
normally excised in proinsulin maturation to insulin. The C chain of proinsulin is an
unnecessary part of in vitro production of insulin. Proinsulin folds properly and forms the
appropriate disulfide bonds in the absence of the C chain. The remaining KR motif that exists
between the B chain and the A chain in MPris allows for mature insulin production upon
cleavage with trypsin and carboxypeptidase B. This construct will be used for our proposed
biopolymer fusion protein. Its codon optimization and amino acid sequence are ideal for
mature insulin production.



e) Our current human proinsulin-biopolymer fusion protein contains a factor Xa proteolytic cut
site, which serves as a cleavage point between the biopolymer and the proinsulin. Currently,
cleavage of the polymer-proinsulin fusion protein with factor Xa has been inefficient in
our hands. Therefore, we will replace this cut site with a trypsin cut site. This will eliminate
the need for the expensive factor Xa in processing proinsulin. Since proinsulin is currently
processed by trypsin in the formation of mature insulin, insulin maturation and fusion peptide
cleavage can be achieved in a single step with trypsin and carboxypeptidase B.


f) We observed incomplete translation products in plastids when we expressed the 120mer gene
(Guda et al. 2000). Therefore, while expressing the polymer-proinsulin fusion protein, we
have decreased the length of the polymer protein to 40mer, without losing the thermal
responsive property. In addition, optimal codons for glycine (GGT) and valine (GTA), which
constitute 80% of the total amino acids of the polymer, have been used. In all nuclear
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mes, glycine makes up 147/1000 amino acids while in tobacco chloroplasts it is 
129/1000. Highly expressing genes like psbA and rbcL of tobacco contain 192 and 190
gly/1000, respectively. Therefore, glycine may not be a limiting factor. Nuclear genes use 52/1000
proline as opposed to 42/1000 in chloroplasts. However, the currently used codon for proline (CCG)
could be modified to CCA or CCT to further enhance translation. It is known that pathways
for proline and valine are compartmentalized in chloroplasts (Guda et al. 2000). Also, proline
is known to accumulate in chloroplasts as an osmoprotectant (Daniell et al. 1994).

g) Codon comparison of the CTB gene with psbA showed 47% homology with the most
frequent codons of the psbA gene. Codon analysis showed that 34% of the codons of CTB
are complementary to the tRNA population in the chloroplasts, in comparison with 51% of
psbA codons that are complementary to the chloroplast tRNA population. Because of the high
levels of CTB expression in transgenic chloroplasts (Henriques and Daniell, 2000), there will
be no need to modify the CTB gene.
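
The codon comparisons in items f) and g) can be illustrated with a minimal sketch. It assumes a
coding sequence given as a plain DNA string whose length is a multiple of three; the function
names, the example sequence and the "preferred" codon set are hypothetical and are not taken
from the proposal. It only shows how a per-1000 amino acid frequency and a share of codons
matching a reference set might be computed.

```python
# Hypothetical helper functions; names and example data are illustrative assumptions.
GLYCINE_CODONS = {"GGT", "GGC", "GGA", "GGG"}


def split_codons(cds):
    """Split a coding sequence into codons (length must be a multiple of 3)."""
    cds = cds.upper()
    if len(cds) == 0 or len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a non-zero multiple of 3")
    return [cds[i:i + 3] for i in range(0, len(cds), 3)]


def per_thousand(cds, codon_set):
    """Occurrences of the given codons per 1000 codons of the sequence."""
    codons = split_codons(cds)
    hits = sum(1 for c in codons if c in codon_set)
    return 1000.0 * hits / len(codons)


def fraction_preferred(cds, preferred):
    """Share of codons that belong to a reference set of preferred codons."""
    codons = split_codons(cds)
    return sum(1 for c in codons if c in preferred) / len(codons)


if __name__ == "__main__":
    example = "GGTGTAGGTCCG"  # toy sequence: Gly-Val-Gly-Pro
    print(per_thousand(example, GLYCINE_CODONS))
    print(fraction_preferred(example, {"GGT", "GTA", "CCA"}))
```

In practice the preferred set would be derived from a reference gene such as psbA, and stop
codons would be excluded before counting.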

DNA sequence of all constructs will be determined to confirm the correct orientation of
genes, in-frame fusion, and accurate sequences in the recombinant DNA constructs. DNA
sequencing will be done using a Perkin Elmer ABI Prism 373 DNA sequencing system using an
ABI Prism Dye Termination Cycle Sequencing kit. By using primers for each strand, insertion
sites at both ends will be sequenced.

Because of the similarity of protein synthetic machinery (Brixey et al. 1997), expression
of all chloroplast vectors will be first tested in E. coli before their use in tobacco
transformation. For Escherichia coli expression, the XL-1 Blue strain was used. E. coli will be
transformed by the standard CaCl2 method.

Bombardment and Regeneration of Chloroplast Transgenic Plants: Tobacco (Nicotiana
tabacum var. Petit Havana) and nicotine free edible tobacco (LAMD 605, gift from Dr. Keith
Wycoff, Planet Biotechnology) plants will be grown aseptically by germination of seeds on MSO
medium (Daniell 1993). Fully expanded, dark green leaves of about two month old plants will be
used for bombardment.

Leaves will be placed abaxial side up on a Whatman No. 1 filter paper lying on the
RMOP medium (Daniell, 1993) in standard petri plates (100x15 mm) for bombardment. Gold
(0.6 µm) microprojectiles will be coated with plasmid DNA (chloroplast vectors) and
bombardments will be carried out with the biolistic device PDS1000/He (Bio-Rad) as described
by Daniell (1997). Following bombardment, petri plates will be sealed with parafilm and
incubated at 24°C under 12 h photoperiod. Two days after bombardment, leaves will be chopped
into small pieces of ~5 mm² in size and placed on the selection medium (RMOP containing 500
µg/ml of spectinomycin dihydrochloride) with abaxial side touching the medium in deep
(100x25 mm) petri plates (~10 pieces per plate). The regenerated spectinomycin resistant shoots
will be chopped into small pieces (~2 mm²) and subcloned into fresh deep petri plates (~5 pieces
per plate) containing the same selection medium. Resistant shoots from the second culture cycle
will be transferred to the rooting medium (MSO medium supplemented with IBA, 1 mg/liter and
spectinomycin dihydrochloride, 500 mg/liter). Rooted plants will be transferred to soil and
grown at 26°C under continuous lighting conditions for further analysis.

Polymerase Chain Reaction: PCR will be done using DNA isolated from control and transgenic
plants in order to distinguish a) true chloroplast transformants from mutants and b) chloroplast
transformants from nuclear transformants. Primers for testing the presence of the aadA gene (that
confers spectinomycin resistance) in transgenic plants will be landed on the aadA coding
sequence and 16S rRNA gene (primers 1P & 1M). In order to test chloroplast integration of the
insulin gene, one primer will land on the aadA gene while another will land on the native
chloroplast genome (primers 3P & 3M). No PCR product will be obtained with nuclear transgenic
plants using this set of primers. The primer set (2P & 2M) will be used to test integration of the
entire gene cassette without any internal deletion or looping out during homologous
recombination, by landing on the respective recombination sites. A similar strategy has been
used successfully by us to confirm chloroplast integration of foreign genes (Daniell et al., 1998;
Kota et al., 1999; Guda et al., 2000). This screening is essential to eliminate mutants and nuclear
transformants. In order to conduct PCR analyses in transgenic plants, total DNA from
unbombarded and transgenic plants will be isolated as described by Edwards et al. (1991).
Chloroplast transgenic plants containing the proinsulin gene will be moved to a second round of
selection in order to achieve homoplasmy.
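
One reading of the PCR screening logic above can be written as a small decision function. This
is a sketch only: the function name, the boolean inputs and the category labels are
hypothetical, and it assumes that a product with the aadA primers indicates the presence of the
selectable marker while a product with the aadA/native-genome primers indicates chloroplast
integration.

```python
# Illustrative sketch; labels and input encoding are assumptions, not part of the protocol.
def classify_shoot(aada_product, integration_product):
    """Classify a spectinomycin-resistant shoot from two PCR outcomes.

    aada_product: True if the aadA primer pair (1P & 1M) gives a product.
    integration_product: True if the aadA/native-genome pair (3P & 3M) gives a product.
    """
    if integration_product:
        # One primer lands on the native chloroplast genome, so a product
        # is only expected when the cassette sits in the chloroplast genome.
        return "chloroplast transformant"
    if aada_product:
        # aadA is present but not linked to the chloroplast genome.
        return "nuclear transformant"
    # Resistant without aadA: treated here as a spontaneous mutant.
    return "mutant"


if __name__ == "__main__":
    for outcome in [(True, True), (True, False), (False, False)]:
        print(outcome, classify_shoot(*outcome))
```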

Southern Blot Analysis: Southern blots will be done to determine the copy number of the
introduced foreign gene per cell as well as to test homoplasmy. There are several thousand
copies of the chloroplast genome present in each plant cell. Therefore, when foreign genes are
inserted into the chloroplast genome, it is possible that some of the chloroplast genomes have
foreign genes integrated while others remain as the wild type (heteroplasmy). Therefore, in order
to ensure that only the transformed genome exists in cells of transgenic plants (homoplasmy), the
selection process will be continued. In order to confirm that the wild type genome does not exist
at the end of the selection cycle, total DNA from transgenic plants should be probed with the
chloroplast border (flanking) sequences (the trnI-trnA fragment, Figure 2A, 3B). If wild type
genomes are present (heteroplasmy), the native fragment size will be observed along with
transformed genomes. Presence of a large fragment (due to insertion of foreign genes within the
flanking sequences) and absence of the native small fragment should confirm homoplasmy
(Daniell et al., 1998; Kota et al., 1999; Guda et al., 2000).

The copy number of the integrated gene will be determined by establishing homoplasmy
for the transgenic chloroplast genome. Tobacco chloroplasts contain 5000-10,000 copies of
their genome per cell (Daniell et al. 1998). If only a fraction of the genomes is actually
transformed, the copy number, by default, must be less than 10,000. By establishing that in the
transgenics the insulin-inserted transformed genome is the only one present, one could establish
that the copy number is 5000-10,000 per cell. This is usually done by digesting the total DNA
with a suitable restriction enzyme and probing with the flanking sequences that enable
homologous recombination into the chloroplast genome. The native fragment present in the
control should be absent in the transgenics. The absence of the native fragment proves that only
the transgenic chloroplast genome is present in the cell and there is no native, untransformed
chloroplast genome without the insulin gene present. This establishes the homoplasmic nature of
our transformants, simultaneously providing us with an estimate of 5000-10,000 copies of the
foreign genes per cell.
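
The interpretation of the Southern blot band pattern described above can be sketched as
follows. The fragment sizes in the example are hypothetical placeholders (the text does not
state them), as are the function name and the size tolerance; the sketch only encodes the rule
that a transformed fragment without the native fragment indicates homoplasmy.

```python
# Illustrative sketch; fragment sizes and tolerance are hypothetical values.
def interpret_southern(observed_kb, native_kb, transformed_kb, tol=0.1):
    """Interpret bands (in kb) detected with the flanking-sequence probe."""
    has_native = any(abs(band - native_kb) <= tol for band in observed_kb)
    has_transformed = any(abs(band - transformed_kb) <= tol for band in observed_kb)
    if has_transformed and not has_native:
        return "homoplasmic"      # only the larger, transformed fragment
    if has_transformed and has_native:
        return "heteroplasmic"    # both genome types present
    if has_native:
        return "untransformed"    # control-like pattern
    return "uninterpretable"


if __name__ == "__main__":
    print(interpret_southern([7.9], native_kb=4.5, transformed_kb=7.9))
    print(interpret_southern([4.5, 7.9], native_kb=4.5, transformed_kb=7.9))
```

Under the reasoning in the text, only the homoplasmic pattern supports the estimate of
5000-10,000 foreign gene copies per cell.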

Northern Blot Analysis: Northern blots will be done to test the efficiency of transcription of the
proinsulin gene fused with CTB or polymer genes. Total RNA will be isolated from 150 mg of
frozen leaves by using the "RNeasy Plant Total RNA Isolation Kit" (Qiagen Inc., Chatsworth,
CA). RNA (10-40 µg) will be denatured by formaldehyde treatment, separated on a 1.2%
agarose gel in the presence of formaldehyde and transferred to a nitrocellulose membrane (MSI)
as described in Sambrook et al. (1989). Probe DNA (proinsulin gene coding region) will be
labeled by the random-primed method (Promega) with 32P-dCTP isotope. The blot will be
pre-hybridized, hybridized and washed as described above for Southern blot analysis. Transcript
levels will be quantified by the Molecular Analyst Program using the GS-700 Imaging
Densitometer (Bio-Rad, Hercules, CA).

Polymer-insulin fusion protein purification, quantitation and characterization:
Because polymer insulin fusion proteins exhibit inverse temperature transition properties (Figure
9 and 10), they will be purified from transgenic plants essentially following the same method
recently described by us for polymer purification from transgenic tobacco plants (Zhang et
al., 1996). Polymer extraction buffer contains 50 mM Tris-HCl, pH 7.5, 1% 2-mercaptoethanol,
5 mM EDTA, 2 mM PMSF and 0.8 M NaCl. The homogenate will then be centrifuged at
10,000 g for 10 minutes (4°C), and the pellet will be discarded. The supernatant will be
incubated at 42°C for 30 minutes and then centrifuged immediately for 3 minutes at 5,000 g
(room temperature). If insulin is found to be sensitive to this temperature, Tt will be lowered by
increasing salt concentration (McPherson et al., 1996). The pellet containing the insulin-polymer
fusion protein will be resuspended in the extraction buffer and incubated on ice for 10 minutes.
The mixture will be centrifuged at 12,000 g for 10 minutes (4°C). The supernatant will be
collected and stored at -20°C. The purified polymer insulin fusion protein will be
electrophoresed in an SDS-PAGE gel according to Laemmli (1970) and visualized by either
staining with 0.3 M CuCl2 (Lee et al., 1987) or transferred to nitrocellulose membrane and probed
with antiserum raised against the polymer or insulin protein as described below. Quantification
of purified polymer proteins will be carried out by ELISA in addition to densitometry.

After electrophoresis, proteins will be transferred to a nitrocellulose membrane
electrophoretically in 25 mM Tris, 192 mM glycine, 5% methanol (pH 8.3). The filter will be
blocked with 2% dry milk in Tris-buffered saline for two hours at room temperature and stained
with antiserum raised against the polymer AVGVP (kindly provided by the University of
Alabama at Birmingham, monoclonal facility) overnight in 2% dry milk/Tris-buffered saline.
The protein bands reacting to the antibodies will be visualized using alkaline phosphatase-linked
secondary antibody and the substrates nitroblue tetrazolium and 5-bromo-4-chloro-3-indolyl-
phosphate (Bio-Rad). Alternatively, for insulin-polymer fusion proteins, a mouse anti-human
proinsulin (IgG1) monoclonal antibody will be used as a primary antibody. To detect the binding
of the primary antibody to the recombinant proinsulin, a goat anti-mouse IgG horseradish
peroxidase (HRP)-labeled monoclonal antibody will be used. The substrate to be used with
HRP will be 3,3',5,5'-tetramethylbenzidine. All products will be purchased
from American Qualex Antibodies in San Clemente, CA. As a positive control, human
recombinant proinsulin from Sigma will be used. This human recombinant proinsulin was
expressed in E. coli by a synthetic proinsulin gene. Quantification of purified polymer fusion
proteins will be carried out by densitometry using Scanning Analysis software (BioSoft,
Ferguson, MO). Total protein contents will be determined by the dye-binding assay using
reagents supplied in kit from Bio-Rad, with bovine serum albumin as a standard.

Characterization of CTB expression: CTB protein levels in transgenic plant crude extract will
be determined using quantitative ELISA assays. A standard curve will be generated using known
concentrations of bacterial CTB. A 96-well microtiter plate loaded with 100 µl/well of bacterial
CTB (concentrations in the range of 10-1000 ng) will be incubated overnight at 4°C. The plate
will be washed thrice with PBST (phosphate buffered saline containing 0.05% Tween-20). The
background will be blocked by incubation in 1% bovine serum albumin (BSA) in PBS
(300 µl/well) at 37°C for 2 h followed by washing 3 times with PBST. The plate will be incubated
in a 1:8,000 dilution of rabbit anti-cholera toxin antibody (Sigma C-3062) (100 µl/well) for 2 h at
37°C, followed by washing the wells three times with PBST. The plate will be incubated with a
1:80,000 dilution of anti-rabbit IgG conjugated with alkaline phosphatase (100 µl/well) for 2 h at
37°C and washed thrice with PBST. Then, 100 µl alkaline phosphatase substrate (Sigma Fast
p-nitrophenyl phosphate tablet in 5 ml of water) will be added and the reaction will be stopped
with 1 M NaOH (50 µl/well) when absorbances in the mid-range of the titration reach about 2.0, or
after 1 hour, whichever comes first. The plate will then be read at 405 nm. These results will be
used to generate a standard curve from which concentrations of plant protein can be extrapolated.
Thus, total soluble plant protein (concentration previously determined using the Bradford assay)
in bicarbonate buffer, pH 9.6 (15 mM Na2CO3, 35 mM NaHCO3) will be loaded at 100
µl/well and the same procedure as above can be repeated. The absorbance values will be used to
determine the ratio of CTB protein to total soluble plant protein, using the standard curve
generated previously and the Bradford assay results.
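
The standard-curve calculation described above can be sketched as follows. This is a minimal
illustration that assumes piecewise-linear interpolation between standards; the function names
and the standard values in the example are hypothetical, and a fitted curve (for example a
four-parameter logistic) could equally be used.

```python
# Illustrative sketch; standards and function names are hypothetical.
from bisect import bisect_left


def interpolate_ctb_ng(absorbance, standards):
    """Estimate CTB (ng) from an A405 reading.

    standards: list of (ctb_ng, absorbance) pairs from the bacterial CTB curve.
    Linear interpolation between neighbouring standards is an assumption.
    """
    points = sorted(standards, key=lambda p: p[1])
    abs_values = [a for _, a in points]
    if not abs_values[0] <= absorbance <= abs_values[-1]:
        raise ValueError("reading is outside the range of the standard curve")
    i = bisect_left(abs_values, absorbance)
    if abs_values[i] == absorbance:
        return float(points[i][0])
    (x0, y0), (x1, y1) = points[i - 1], points[i]
    return x0 + (absorbance - y0) * (x1 - x0) / (y1 - y0)


def percent_of_total_soluble_protein(ctb_ng, tsp_ng):
    """CTB as a percentage of total soluble protein loaded (Bradford value)."""
    return 100.0 * ctb_ng / tsp_ng


if __name__ == "__main__":
    curve = [(10, 0.1), (100, 0.6), (1000, 2.0)]  # made-up standards
    ctb = interpolate_ctb_ng(0.35, curve)
    print(ctb, percent_of_total_soluble_protein(ctb, 5500))
```

Readings outside the range of the standards are rejected rather than extrapolated, so such
samples would need to be diluted or concentrated and re-assayed.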

Inheritance of Introduced Foreign Genes: While it is unlikely that introduced DNA would
move from the chloroplast genome to nuclear genome, it is possible that the gene could get
integrated in the nuclear genome during bombardment and remain undetected in Southern
analysis. Therefore, in initial tobacco transformants, some will be allowed to self-pollinate,
whereas others will be used in reciprocal crosses with control tobacco (transgenics as female
acceptors and pollen donors; testing for maternal inheritance). Harvested seeds (T1) will be
germinated on media containing spectinomycin. Achievement of homoplasmy and mode of
inheritance can be classified by looking at germination results. Homoplasmy should be indicated
by totally green seedlings (Daniell et al., 1998) while heteroplasmy is displayed by variegated
leaves (lack of pigmentation, Svab & Maliga, 1993). Lack of variation in chlorophyll
pigmentation among progeny should also underscore the absence of position effect, an artifact of
nuclear transformation. Maternal inheritance will be demonstrated by sole transmission of
introduced genes via seed generated on transgenic plants, regardless of pollen source (green
seedlings on selective media). When transgenic pollen is used for pollination of control plants,
resultant progeny would not contain resistance to the chemical in selective media (will appear
bleached; Svab and Maliga, 1993). Molecular analyses will confirm transmission and expression
of introduced genes, and T2 seed will be generated from those plants confirmed by the analyses
described above.

Comparison of Current Purification with Polymer-based Purification Methods: It is
important to compare purification methods by testing yield and purity of insulin produced in
E. coli and tobacco. Three methods will be compared: a standard fusion protein in E. coli,
polymer proinsulin fusion protein in E. coli, and polymer proinsulin fusion in tobacco. Polymer
proinsulin fusion peptide from transgenic tobacco will be purified by methodology described in
section c) and Daniell (1997). E. coli purification will be performed as follows. One liter of each
pLD containing bacteria will be grown in LB/ampicillin (100 µg/ml) overnight and the fusion
protein, either polymer-proinsulin or the control fusion protein (Cowley and Mackin 1997), will
be expressed. Cells will be harvested by centrifugation at 5000 X g for 10 min at 4°C, and the
bacterial pellets will be resuspended in 5 ml/g (wet wt. bacteria) of 100 mM Tris-HCl, pH 7.3.
Lysozyme will be added at a concentration of 1 mg/ml and placed on a rotating shaker at room
temperature for 15 min. The lysate will be subjected to probe sonication for two cycles of 30 s
on/30 s off at 4°C. Cellular debris will be removed by centrifugation at 1000 X g for 5 min at
4°C. The E. coli produced proinsulin polymer fusion protein will be purified by inverse
temperature transition properties (Daniell et al., 1997). After Factor Xa cleavage (as described in
section c)) the proinsulin will be isolated from the polymer using inverse temperature transition
properties (Daniell et al., 1997) and subjected to oxidative sulfitolysis as described below.
Alternatively, the control fusion protein will be purified according to Cowley and Mackin (1997)
as follows. The supernatant will be retained and centrifuged again at 27000 X g for 15 min at
4°C to pellet the inclusion bodies. The supernatant will be discarded and the pellet resuspended
in 1 ml/g (original wt. bacteria) of dH2O, aliquoted into microcentrifuge tubes as 1 ml fractions,
and then centrifuged at 16000 X g for 5 min at 4°C. The pellets will be individually washed with
1 ml of 100 mM Tris-HCl, pH 8.5, 1 M urea, 1% Triton X-100 and again washed with 100 mM
Tris-HCl, pH 8.5, 2 M urea, 2% Triton X-100. The pellets will be resuspended in 1 ml of dH2O
and transferred to a pre-weighed 30 ml Corex centrifuge tube. The sample will be centrifuged at
15000 X g for 5 min at 4°C, and the pellet will be resuspended in 10 ml/g (wet wt. pellet) of
70% formic acid. Cyanogen bromide will be added to a final concentration of 400 mM and the
sample will be incubated at room temperature in the dark for 16 h. The reaction will be stopped
by transferring the sample to a round bottom flask and removing the solvent by rotary
evaporation at 50 °C. The residue will be resuspended in 20 ml/g (wet wt. pellet) of dH2O, shell
frozen in a dry ice ethanol bath, and then lyophilized. The lyophilized protein will be dissolved
in 20 ml/g (wet wt. pellet) of 500 mM Tris-HCl, pH 8.2, 7 M urea. Oxidative sulfitolysis will be
performed by adding sodium sulfite and sodium tetrathionate to final concentrations of 100 and
10 mM, respectively, and incubating at room temperature for 3 h. This reaction will be stopped
by freezing on dry ice.

Purification and folding of Human Proinsulin: The S-sulfonated material will be applied to a
2 ml bed of Sephadex G-25 equilibrated in 20 mM Tris-HCl, pH 8.2, 7 M urea, and then washed
with 9 vols of 7 M urea. The collected fraction will be applied to a Pharmacia Mono Q HR 5/5
column equilibrated in 20 mM Tris-HCl, pH 8.2, 7 M urea at a flow rate of 1 ml/min. A linear
gradient leading to a final concentration of 0.5 M NaCl will be used to elute the bound material.
Fractions of 2 min (2 ml) will be collected during the gradient, and protein concentration in each
fraction will be determined. Purity and molecular mass of fractions will be estimated by Tricine
SDS-PAGE (as shown in Fig. 2), where Tricine is used as the trailing ion to allow better
resolution of peptides in the range of 1-1000 kDa. Appropriate fractions will be pooled and
applied to a 1.6 X 20 cm column of Sephadex G-25 (superfine) equilibrated in 5 mM ammonium
acetate pH 6.8. The sample will be collected based on UV absorbance and freeze-dried. The
partially purified S-sulfonated material will be resuspended in 50 mM glycine/NaOH, pH 10.5 at
a final concentration of 2 mg/ml. β-mercaptoethanol will be added at a ratio of 1.5 mol per mol
of cysteine S-sulfonate and the sample will be stirred at 4°C in an open container for 16 h. The
sample will then be analyzed by reversed-phase high-performance liquid chromatography
(RP-HPLC) using a Vydac C4 column (2.2 X 150 mm) equilibrated in 4% acetonitrile and 0.1% TFA.
Adsorbed peptides will be eluted with a linear gradient of increasing acetonitrile concentration
(0.88% per min up to a maximum of 48%). The remaining refolded proinsulin will be
centrifuged at 16000 X g to remove insoluble material, and loaded onto a semi-preparative
Vydac C4 column (10 X 250 mm). The bound material will be eluted as described above, and the
proinsulin will be collected and lyophilized.

Analysis and characterization of insulin expressed in E. coli and Tobacco: The purified
expressed proinsulin will be subjected to matrix-assisted laser desorption/ionization-time of
flight (MALDI-TOF) analysis (as described by Cowley and Mackin, 1997), using proinsulin
from Eli Lilly as both an internal and external standard. To determine if the disulfide bridges
have formed correctly naturally inside chloroplasts or by in vitro processing, a proteolytic
digestion will be performed using Staphylococcus aureus protease V8. Five µg of both the
expressed proinsulin and Eli Lilly's proinsulin will be lyophilized and resuspended in 50 µl of
250 mM NaPO4, pH 7.8. Protease V8 will be added at a ratio of 1:50 (w/w) in experimental
samples and no enzyme will be added to the controls. All samples will then be incubated
overnight at 37°C, the reactions will be stopped by freezing on dry ice, and samples will be
stored at -20 °C until analyzed. The samples will be analyzed by RP-HPLC using a Vydac C4
column (2.2 X 150 mm) equilibrated in 4% acetonitrile and 0.1% TFA. Bound material will be
eluted using a linear gradient of increasing acetonitrile concentration (0.88% per min up to a
maximum of 48%).
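
Two quantities implied by the digestion and RP-HPLC conditions above can be worked out with a
short sketch. It assumes that the gradient starts at the 4% acetonitrile equilibration
concentration, which the text does not state explicitly; the function names are hypothetical.

```python
# Illustrative arithmetic; the 4% starting point of the gradient is an assumption.
def gradient_minutes(start_pct, end_pct, rate_pct_per_min):
    """Duration of a linear gradient from start_pct to end_pct."""
    if rate_pct_per_min <= 0:
        raise ValueError("gradient rate must be positive")
    return (end_pct - start_pct) / rate_pct_per_min


def enzyme_mass_ug(substrate_ug, substrate_per_enzyme=50):
    """Mass of protease for a 1:substrate_per_enzyme (w/w) enzyme:substrate ratio."""
    return substrate_ug / substrate_per_enzyme


if __name__ == "__main__":
    # 4% to 48% acetonitrile at 0.88% per min
    print(round(gradient_minutes(4, 48, 0.88), 2), "min")
    # 5 ug of proinsulin at 1:50 (w/w)
    print(enzyme_mass_ug(5), "ug protease V8")
```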

CTB-GM1 ganglioside binding assay: A GM1-ELISA assay will be performed as described by
Arakawa et al. (1997) to determine the affinity of plant-derived CTB for GM1-ganglioside. The
microtiter plate will be coated with monosialoganglioside-GM1 (Sigma G-7641) by incubating
the plate with 100 µl/well of GM1 (3.0 µg/ml) in bicarbonate buffer, pH 9.6 at 4 °C overnight.
Alternatively, the wells will be coated with 100 µl/well of BSA (3.0 µg/ml) as control. The plates
will be incubated with transformed plant total soluble protein and bacterial CTB (Sigma C-9903)
in PBS (100 µl/well) overnight at 4 °C. The remainder of the procedure will be identical to the
ELISA described above.

Induction of oral tolerance: Four week old female NOD mice will be purchased from Jackson
Laboratory (Bar Harbor, ME) and housed at the animal care facility located in the school of
Biology at the University of Central Florida (UCF). The mice will be divided into three groups,
each group consisting of ten mice. Each group will be fed one of the following types of nicotine
free edible tobacco: untransformed, expressing CTB, or expressing CTB-proinsulin fusion protein.
Beginning at 5 weeks of age, each mouse will be fed 3 g of nicotine free edible tobacco once per
week until reaching 9 weeks of age (a total of five feedings).

Antibody titer: At ten weeks of age, the serum and fecal material will be assayed for anti-CTB 
and anti-proinsulin antibody isotypes using the ELISA method described above. 

Assessment of diabetic symptoms in NOD mice: The incidence of diabetic symptoms will be
compared among mice fed with control nicotine free edible tobacco that expresses CTB and
those fed tobacco that expresses the CTB-proinsulin fusion protein. Starting at 10 weeks of age,
the mice will be monitored on a biweekly basis with urinary glucose test strips (Clinistix and
Diastix, Bayer) for development of diabetes. Glycosuric mice will be bled from the tail vein to
check for glycemia using a glucose analyzer (Accu-Check, Boehringer Mannheim). Diabetes will be
confirmed by hyperglycemia (>250 mg/dl) for two consecutive weeks (Ma et al. 1997). The plant
tissue of control and transgenic plants to be fed to mice will be provided to a collaborator (Dr.
Ali Amirkhosravi, Florida Hospital) to perform these studies. A letter of collaboration is
provided documenting this arrangement.

Tentative Proposed Schedule 

Year I:

a) Develop recombinant DNA vectors for enhanced translation of proinsulin as fusion protein
with protein based polymers or CTB via chloroplast genomes of tobacco

b) Obtain transgenic tobacco plants using the transformation vectors

c) Assay transgenic expression of insulin-polymer fusion protein and CTB in chloroplasts using
molecular and biochemical methods

Year II:

d) Employ existing methods of polymer purification from transgenic leaves or develop new
approaches for the fusion protein and estimate levels of expression

e) Analyze genetic composition of transgenic plants (Mendelian or maternal inheritance)

f) Large scale purification of insulin from green house grown transgenic plants and comparison
of current insulin purification methods with polymer-based purification method

Year III:

g) Refolding and characterization (yield and purity) of proinsulin produced in E. coli and
transgenic tobacco

h) Assessment of diabetic symptoms in NOD mice fed with leaves expressing CTB-proinsulin
fusion protein

j) Assessment of immune response in mice fed with leaves expressing CTB

k) Continue to characterize subsequent transgenic generations (T1, T2, T3).

f. Vertebrate Animals

Description of proposed work: Oral tolerance and the incidence of diabetic symptoms will be
compared among mice fed with nontransgenic tobacco (negative control), CTB expressing
nicotine free edible tobacco (LAMD 605) and those that express the CTB-proinsulin fusion
protein. Thirty female nonobese diabetic (NOD) mice, four weeks of age, will be purchased
from Jackson Laboratory (Bar Harbor, ME), and housed at the animal care facility located in the
school of Biology at the University of Central Florida (UCF).

Experimental groups: The mice will be divided into the following groups, each group
consisting of ten mice: group 1, fed untransformed LAMD 605; group 2, fed transgenic LAMD
605 synthesizing CTB; and group 3, fed transgenic LAMD 605 synthesizing CTB-proinsulin
fusion protein. Beginning at five weeks of age, each mouse will be fed 3 g of LAMD 605 once
per week until reaching 9 weeks of age (a total of five feedings). At ten weeks, serum and fecal
material will be assayed for anti-CTB and anti-insulin antibody isotypes using ELISA as
described above.

The incidence of diabetic symptoms will be compared among mice fed transgenic LAMD 605
synthesizing CTB and LAMD 605 synthesizing CTB-proinsulin fusion protein. Starting at 10
weeks of age, the mice will be monitored on a biweekly basis with urinary glucose test strips
(Clinistix and Diastix, Bayer) for development of diabetes. Glycosuric mice will be bled from the
tail vein to check for glycemia using a glucose analyzer (Accu-Check, Boehringer Mannheim).
Diabetes will be confirmed by hyperglycemia (>250 mg/dl) for two consecutive weeks (Ma et al.
1997).
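
The confirmation criterion above (hyperglycemia above 250 mg/dl for two consecutive weeks) can
be expressed as a small check. This is a sketch under the assumption of one blood glucose
reading per mouse per week; the function name and the example readings are hypothetical.

```python
# Illustrative sketch; assumes one reading per week, values in mg/dl.
def week_diabetes_confirmed(weekly_glucose_mg_dl, threshold=250.0, consecutive=2):
    """Return the 1-based week at which diabetes is confirmed, or None.

    Diabetes is confirmed at the last week of the first run of `consecutive`
    readings strictly above `threshold`.
    """
    streak = 0
    for week, value in enumerate(weekly_glucose_mg_dl, start=1):
        streak = streak + 1 if value > threshold else 0
        if streak >= consecutive:
            return week
    return None


if __name__ == "__main__":
    readings = [180, 260, 240, 270, 300]  # made-up values
    print(week_diabetes_confirmed(readings))
```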

Investigator: The plant tissue of control and transgenic plants to be fed to mice will be provided
to a collaborator, Ali Amirkhosravi, Ph.D. (Florida Hospital), to perform these studies. A letter of
collaboration is provided documenting this arrangement. Dr. Amirkhosravi's expertise for
performing scientific investigations involving animals is demonstrated by his experience and
publications provided in his resume.

Justification of species selection: Female NOD mice have a high incidence of developing
autoimmune diabetes after 12 weeks of age (Gaskins et al. 1992). Therefore, they are the
appropriate model for our study for the prevention of autoimmune diabetes. According to the
previous experience and Arakawa et al. (1998), ten mice for each group, thirty
mice total, will provide our study with the number necessary for reliable data.

Veterinary care: Faiol Tomson DVM is the veterinary consultant for the UCF animal care
facility.

Discomfort, distress, pain, and injury: According to the investigator's experience, the mice
involved in this investigation will not experience discomfort or severe symptoms, including
severe diabetic symptoms. Furthermore, the specific diet for the mice is well tolerated. Animals
will be checked regularly, and in case of any visible signs of distress or pain, animals will be
removed from the study; however, this is unlikely.

Euthanasia: Upon completion of the study mice will be euthanized by an overdose of the
inhalant anesthetic halothane, which is a standard method of euthanasia. This method is
consistent with the recommendations of the Panel on Euthanasia of the American Veterinary
Medical Association.

PRODUCTION OF HUMAN INSULIN IN TRANSGENIC TOBACCO
Henry Daniell, Principal Investigator 

As the value of molecular farming is realized, investigations are being performed to
develop plant based expression systems. Aside from being an environmentally friendly approach,
chloroplast genetic engineering continually exceeds nuclear genetic engineering in total
production of foreign proteins in plants. A recent publication from our lab (featured on the cover
of Nature Biotechnology, January 2001) demonstrated foreign gene expression up to 46% of the
total soluble protein in chloroplast transgenic plants. However, the full potential of this
technology is yet to be realized for biopharmaceutical production. Our proposed investigation
entails both maximizing proinsulin production in chloroplast transgenic plants and evaluating
proposed modifications for future chloroplast foreign gene expression.

Vector construction to synthesize the Cholera toxin B (CTB) subunit fused to native
human proinsulin was first completed. As described on page 36 of this proposal, standard
molecular biological techniques were used to create the sequence for the fusion protein DNA
from individual genes encoding CTB and native human proinsulin. The DNA encoding this
fusion protein was then subcloned into the chloroplast transformation vector (pLD).

Although the pLD vector contains all the necessary elements for chloroplast expression
of the CTB-proinsulin fusion protein, an important goal of this investigation is to evaluate the
elements of translation that maximize foreign protein production. Therefore, as proposed on
pages 35-38, we have developed additional constructs, each with a different modification, to
allow for both the optimization of CTB-proinsulin gene expression and evaluation of these
modifications.

We have cloned the 5' untranslated region of the tobacco psbA gene including the
promoter (5'UTR), shown in Figure 1 and as proposed on pages 36-37. We performed PCR
using the primers CCGTCGACGTAGAGAAGTCCGTATT and
GCCCATGGTAAAATCTTGGTTTATTTA, which resulted in a 200 base pair product, as
expected. We inserted this PCR product into a TA cloning vector. Since restriction enzyme sites
were not available to subclone the 5'UTR immediately upstream of the gene coding for the
CTB-proinsulin fusion protein, we used the "SOEing" PCR technique, described on page 37, to
create the DNA sequence with the 5'UTR immediately upstream of the CTB-proinsulin gene (Figure
2). The products of this PCR include both the 5'UTR (200 bp) and the gene for CTB-proinsulin
(600 bp) as additional products as well as the desired 5'UTR CTB-proinsulin (5CP) at 800 bp.
5CP was eluted and then inserted into the TA cloning vector where DNA sequencing was
performed to confirm accuracy of nucleotide sequence before it was subcloned into the pLD
vector.

As discussed on page 35, chloroplast foreign gene expression correlates well with %AT 
of the gene coding sequence. The native human proinsulin sequence is 38% AT, while the newly 
synthesized chloroplast optimized proinsulin is 64% AT. We determined the optimal chloroplast 
coding sequence for the proinsulin (PTpris) gene by using a codon composition that is equivalent 
to that of the most highly translated chloroplast gene, psbA. The preferred codon composition of 
psbA in tobacco is conserved within 20 vascular plant species. We have compared it to the native 
human proinsulin DNA sequence (Figure 3). Since there are too many changes for conventional 
mutagenesis, we employed the Recursive PCR method for total gene synthesis, as described on 
page 35; Figure 4 shows the product of this gene synthesis corresponding to the 280 bp expected 
size. 
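
The %AT values quoted above are simple base counts over a coding sequence. A minimal sketch of that calculation follows, assuming the sequence is supplied as a plain string of A/C/G/T characters. The function name `at_percent` is illustrative and not part of the original work, and the example input is the forward 5'UTR primer listed earlier in this report rather than either proinsulin sequence (those are shown in Figure 3).

```python
def at_percent(seq: str) -> float:
    """Return the percentage of A and T bases in a DNA sequence.

    Characters other than A, C, G and T (spaces, line breaks) are ignored.
    """
    bases = [b for b in seq.upper() if b in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no A/C/G/T bases")
    at_count = sum(1 for b in bases if b in "AT")
    return 100.0 * at_count / len(bases)


# Illustrative input only: the forward primer used to amplify the psbA 5'UTR.
forward_primer = "CCGTCGACGTAGAGAAGTCCGTATT"
print(f"{at_percent(forward_primer):.1f}% AT")
```

Applied to the native and the synthetic proinsulin coding sequences, the same count would be expected to give the 38% and 64% figures reported above.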

This product, PTpris, was then used as a template with CTB and 5'UTR to create a fusion 
of these sequences using the SOEing PCR technique described on page 37. The products of this 
reaction can be seen in Figure 5. These include 5'UTR (200 bp), CTB (320 bp), Proinsulin (280 
bp), and CTB-Proinsulin (600 bp) as side products, and also the desired 5'UTR CTB-PTpris 
(5CPTP) at 800 bp. This was then inserted into the TA cloning vector where the sequence was 
verified before being subcloned into the pLD vector. 
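
The band sizes reported for the SOEing reactions follow from adding the nominal lengths of the joined fragments. The sketch below is only a bookkeeping aid: it uses the rounded fragment sizes given in the text and assumes the overlaps introduced by the primers do not change those nominal sizes. The names `FRAGMENTS_BP` and `fusion_size` are hypothetical.

```python
# Nominal fragment sizes in base pairs, as reported in the text.
FRAGMENTS_BP = {"5'UTR": 200, "CTB": 320, "PTpris": 280}


def fusion_size(*names: str) -> int:
    """Expected size of a fusion product built from the named fragments."""
    return sum(FRAGMENTS_BP[name] for name in names)


# Side product and desired product of the three-fragment reaction.
print("CTB-Proinsulin:", fusion_size("CTB", "PTpris"), "bp")
print("5CPTP:", fusion_size("5'UTR", "CTB", "PTpris"), "bp")
```

A gel band that matches neither a single fragment nor one of these sums would suggest a mis-primed or incomplete product.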

Another parameter of foreign protein production to be investigated is post-translational. 
As discussed on page 37, the DNA for the putative chaperonin in the Bacillus thuringiensis Cry 
2A2 operon encodes a protein that could potentially fold and crystallize CTB-Proinsulin, which 
would allow it to accumulate in large quantities protected from chloroplast proteases and 
facilitate subsequent purification. Standard molecular biology techniques were used to insert 
this DNA fragment immediately upstream of the 5'UTR of the construct containing the 
chloroplast optimized proinsulin. Additionally, another vector was constructed to contain only 
a Shine-Dalgarno sequence (GGAGG) followed by the sequence encoding the cholera toxin B 
subunit and synthetic chloroplast optimized proinsulin fusion (CTB-PTpris). This construct will 
allow us to determine the value of the proinsulin sequence modification both with and without 
the 5'UTR. 

All of the resulting vectors, containing the desired constructs, were used to transform 
both of the tobacco cultivars, Petit Havana and LAMD 605 (edible tobacco). Transformation was 
performed using the particle bombardment method, as described on pages 38-39. Bombarded 
leaves are currently being regenerated into transgenic plants under spectinomycin selection. 
Several clones have begun to form shoots. The clones of Petit Havana bombarded with the initial 
CTB-human proinsulin construct have regenerated large enough for us to extract DNA. 
Extracted DNA was used as a template in a PCR reaction to confirm integration of the cassette 
into the chloroplast genome by homologous recombination, as described on page 39. We used 
two primers in this reaction, 3P and 3M. 3P anneals with the native chloroplast genome, while 
3M anneals with the gene for spectinomycin resistance, aadA. The 1600 bp product of this 
reaction is indicative of integration of the construct into the genome (Figure 6). This experiment 
demonstrated that 7 of the 11 analyzed clones were the desired chloroplast transgenic plants. 
Western blots are currently underway to confirm expression of various CTB-proinsulin fusion 
proteins in E. coli. Because of the similarity of chloroplast and E. coli protein synthetic 
machinery, chloroplast vectors are routinely tested in our lab before bombardment. Membranes 
have been immunoblotted with antibodies to both CTB and Proinsulin. Results demonstrate the 
presence of the desired fusion proteins. 

We eagerly await the regeneration of the remaining chloroplast transgenic plants. Our 
analysis of these plants will provide essential information to develop this technology for future 
biopharmaceutical production. Our investigation will also establish a method for production and 
delivery of orally administered protein therapies. With adequate production of the CTB- 
proinsulin fusion protein in the edible tobacco plants, direct consumption of the plant tissue, as 
described on page 43, by NOD mice will delay or prevent the onset of autoimmune 
diabetes. 

This project has already overcome initial experimental challenges by successfully 
constructing chloroplast vectors with different regulatory regions utilizing the most challenging 
recombinant DNA techniques. Both native and codon optimized synthetic proinsulin genes have 
been inserted into chloroplast vectors. Our laboratory is quite efficient in carrying out subsequent 
steps to take this project to a successful completion. We look forward to NIH funding of this 
proposal to make rapid progress in proposed objectives. 

a. SPECIFIC AIMS 

Research on human proteins in the past years has revolutionized the use of these 
therapeutically valuable proteins in a variety of clinical situations. Since the demand for these 
proteins is expected to increase considerably in the coming years, it would be wise to ensure that 
in the future they will be available in significantly larger amounts, preferably on a cost-effective 
basis. Because most genes can be expressed in many different systems, it is essential to 
determine which system offers the most advantages for the manufacture of the recombinant 
protein. The ideal expression system would be one that produces a maximum amount of safe, 
biologically active material at a minimum cost. The use of modified mammalian cells with 
recombinant DNA techniques has the advantage of resulting in products which are closely 
related to those of natural origin; however, culturing of these cells is intricate and can only be 
carried out on a limited scale. The use of microorganisms such as bacteria permits manufacture 
on a larger scale, but introduces the disadvantage of producing products which differ appreciably 
from the products of natural origin. For example, proteins that are usually glycosylated in 
humans are not glycosylated by bacteria. Furthermore, human proteins that are expressed at high 
levels in E. coli frequently acquire an unnatural conformation, accompanied by intracellular 
precipitation due to lack of proper folding and disulfide bridges. Production of recombinant 
proteins in plants has many potential advantages for generating biopharmaceuticals relevant to 
clinical medicine. These include the following: (i) plant systems are more economical than 
industrial facilities using fermentation systems; (ii) technology is available for harvesting and 
processing plants/plant products on a large scale; (iii) elimination of the purification requirement 
when the plant tissue containing the recombinant protein is used as a food (edible vaccines); (iv) 
plants can be directed to target proteins into stable, intracellular compartments such as 
chloroplasts, or proteins can be expressed directly in chloroplasts; (v) the amount of recombinant 
product that can be produced approaches industrial-scale levels; and (vi) health risks due to 
contamination with potential human pathogens/toxins are minimized. 

It has been estimated that one tobacco plant should be able to produce more recombinant 
protein than a 300-liter fermenter of E. coli. In addition, a tobacco plant produces a million 
seeds, facilitating large-scale production. Tobacco is also an ideal choice because of its relative 
ease of genetic manipulation and an impending need to explore alternate uses for this hazardous 
crop. However, with the exception of enzymes (e.g. phytase), levels of foreign proteins produced 
in nuclear transgenic plants are generally low, mostly less than 1% of the total soluble protein 
(1). May et al. (2a) discuss this problem using the following examples. Although plant derived 
hepatitis B surface antigen was as effective as a commercial recombinant vaccine, 
the levels of expression in transgenic tobacco were low (0.0066% of total soluble protein). Even 
though Norwalk virus capsid protein expressed in potatoes caused oral immunization when 
consumed as food (edible vaccine), expression levels were low (0.3% of total soluble protein). In 
particular, expression of human proteins in nuclear transgenic plants has been disappointingly 
low: e.g. human interferon-β 0.000017% of fresh weight, human serum albumin 0.02% and 
erythropoietin 0.0026% of total soluble protein (see Table 1 in ref. 1). A synthetic gene coding for 
the human epidermal growth factor was expressed only up to 0.001% of total soluble protein in 
transgenic tobacco (2a). The cost of producing recombinant proteins in alfalfa leaves was 
estimated to be 12-fold lower than in potato tubers and comparable with seeds (1). However, 
tobacco leaves are much larger and have much higher biomass than alfalfa. The cost of 
production of recombinant proteins will be 50-fold lower than that of E. coli fermentation (with 
20% expression levels, 1). A decrease in insulin expression from 20% to 5% of biomass doubled 
the cost of production (2b). Expression levels of less than 1% of total soluble protein in plants 
have been found not to be commercially feasible (1). Therefore, it is important to increase levels 
of expression of recombinant proteins in plants in order to exploit plant production of 
pharmacologically important proteins. 

An alternate approach is to express foreign proteins in chloroplasts of higher plants. We 
have recently integrated foreign genes (up to 10,000 copies per cell) into the tobacco chloroplast 
genome resulting in accumulation of recombinant proteins up to 47% of the total cellular protein 
(3). Chloroplast transformation utilizes two flanking sequences that, through homologous 
recombination, insert foreign DNA into the spacer region between the functional genes of the 
chloroplast genome, thus targeting the foreign genes to a precise location. This eliminates the 
"position effect" and gene silencing frequently observed in nuclear transgenic plants. 
Chloroplast genetic engineering is an environmentally friendly approach, minimizing concerns 
of out-cross of introduced traits via pollen to weeds or other crops. Also, the concerns of insects 
developing resistance to biopesticides are minimized by hyper-expression of single insecticidal 
proteins (high dosage) or expression of different types of insecticides in a single transformation 
event (gene pyramiding). Concerns about effects of insecticidal proteins on non-target insects are 
minimized by lack of expression in transgenic pollen. Most importantly, a significant advantage 
in the production of pharmaceutical proteins in chloroplasts is their ability to process eukaryotic 
proteins, including folding and formation of disulfide bridges (4). Chaperonin proteins that 
function in folding and assembly of prokaryotic/eukaryotic proteins are present in chloroplasts 
(5,6). Also, proteins are activated by disulfide bond oxido/reduction cycles using the 
thioredoxin system (7) or chloroplast protein disulfide isomerase (8). Accumulation 
of the fully assembled, disulfide bonded form of human somatotropin via chloroplast 
transformation (9), the oligomeric form of CTB (10) and assembly of heavy and light chains of 
humanized Guy's 13 antibody in transgenic chloroplasts (11) provide strong evidence for 
successful processing of pharmaceutical proteins inside chloroplasts. Such folding and assembly 
should eliminate the need for highly expensive in vitro processing of pharmaceutical proteins. For 
example, 60% of the total operating cost in the production of human insulin is associated with in 
vitro processing (formation of disulfide bridges and cleavage of methionine) (2b). 

Taken together, low levels of expression of human proteins in nuclear transgenic plants, 
and difficulty in folding, assembly/processing of human proteins in E. coli should make 
chloroplasts an ideal compartment for expression of these proteins; production of human proteins 
in transgenic chloroplasts should also dramatically lower the production cost. Large-scale 
production of these proteins in plants should be a powerful approach to provide treatment to 
patients at an affordable cost and provide tobacco farmers alternate uses for this hazardous crop. 
Therefore, we propose here expression of therapeutic proteins in transgenic tobacco chloroplasts 
to increase levels of expression and accomplish in vivo processing. 



Objectives 

a) Develop recombinant DNA vectors for enhanced expression of Human Serum Albumin, 
Insulin-like growth factor I and Interferon-α2 and -α5, via chloroplast genomes of tobacco 

b) Optimize processing and purification of pharmaceutical proteins using chloroplast vectors in 
E. coli 

c) Obtain transgenic tobacco plants 

d) Characterize transgenic expression of proteins or fusion proteins using molecular and 
biochemical methods in chloroplasts 

e) Employ existing or modified methods of purification from transgenic leaves 

f) Analyze Mendelian or maternal inheritance of transgenic plants 

g) Large scale purification of therapeutic proteins from transgenic tobacco and comparison of 
current purification methods in E. coli or yeast 

h) Compare natural refolding in chloroplasts with existing in vitro processing methods 

i) Comparison/characterization (yield and purity) of therapeutic proteins produced in yeast or 
E. coli with transgenic tobacco chloroplasts 

j) In vitro and in vivo (preclinical trials) studies of protein biofunctionality. 

b. BACKGROUND AND SIGNIFICANCE 

HUMAN SERUM ALBUMIN 

HSA is a monomeric globular protein and consists of a single, generally nonglycosylated, 
polypeptide chain of 585 amino acids (66.5 kDa and 17 disulfide bonds) with no 
posttranslational modifications. It is composed of three structurally similar globular domains and 
the disulfides are positioned in repeated series of nine loop-link-loop structures centered around 
eight sequential Cys-Cys pairs. HSA is initially synthesized as pre-pro-albumin by the liver and 
released from the endoplasmic reticulum after removal of the aminoterminal prepeptide of 18 
amino acids. The pro-albumin is further processed in the Golgi complex where the other 6 
aminoterminal residues of the propeptide are cleaved by a serine proteinase (12). This results in 
the secretion of the mature polypeptide of 585 amino acids. HSA is encoded by two codominant 
autosomal allelic genes. HSA belongs to the multigene family of proteins that include alpha- 
fetoprotein and human group-specific component (Gc) or vitamin D-binding family. HSA 
facilitates transfer of many ligands across organ circulatory interfaces such as in the liver, 
intestine, kidney and brain. In addition to blood plasma, serum albumin is also found in tissues. 
HSA accounts for about 60% of the total protein in blood serum. In the serum of human adults, 
the concentration of albumin is 40 mg/ml. 

Medical applications: The primary function of HSA is the maintenance of colloid osmotic 
pressure (COP) within the blood vessels. Its abundance makes it an important determinant of the 
pharmacokinetic behavior of many drugs. Reduced synthesis of HSA can be due to advanced 
liver disease, impaired intestinal absorption of nutrients or poor nutritional intake. Increased 
albumin losses can be due to kidney diseases (increased glomerular permeability to 
macromolecules in the nephrotic syndrome), intestinal diseases (protein-losing enteropathies) or 
exudative skin disorders (burns). Catabolic states such as chronic infections, sepsis, surgery, 
intestinal resection, trauma or extensive burns can also cause hypoalbuminemia. HSA is used in 
therapy of blood volume disorders, for example posthaemorrhagic acute hypovolemia or 
extensive burns, treatment of dehydration states, and also for cirrhotic and hepatic illnesses. It is 
also used as an additive in perfusion liquid for extracorporeal circulation. HSA is used clinically 
for replacing blood volume, but also has a variety of non-therapeutic uses, including its role as a 
stabilizer in formulations for other therapeutic proteins. HSA is a stabilizer for biological 
materials in nature and is used for preparing biological standards and reference materials. 
Furthermore, HSA is frequently used as an experimental antigen, a cell-culture constituent and a 
standard in clinical-chemistry tests. 

Expression systems: The expression and purification of recombinant HSA from various 
microorganisms has been reported previously (13-17). Saccharomyces cerevisiae has been used 
to produce HSA both intracellularly, requiring denaturation and refolding prior to analysis (18), 
and by secretion (19). Secreted HSA was equivalent structurally, but the recombinant product 
had lower levels of expression (recovery) and structural heterogeneity compared to the blood 
derived protein (20). HSA was also expressed in Kluyveromyces lactis, a yeast with good 
secretory properties, achieving 1 g/liter in fed batch cultures (21). Ohtani et al. (22) developed an 
HSA expression system using Pichia pastoris and established a purification method obtaining 
recombinant protein with similar levels of purity and properties as the human protein. In Bacillus 
subtilis, HSA could be secreted using bacterial signal peptides (15). HSA production in E. coli 
was successful but required additional in vitro processing with trypsin to yield the mature protein 
(14). Sijmons et al. (23) expressed HSA in transgenic potato and tobacco plants. Fusion of HSA 
to the plant PR-S presequence resulted in cleavage of the presequence at its natural site and 
secretion of correctly processed HSA that was indistinguishable from the authentic human 
protein. The expression was 0.014% of the total soluble protein. However, none of these 
methods have been exploited commercially. 

Challenges in commercial production: Albumin is currently obtained by protein fractionation 
from plasma and is the world's most used intravenous protein, estimated at around 500 metric 
tons per year. Albumin is administered by intravenous injection of solutions containing 20% of 
albumin. The average dosage of albumin for each patient varies between 20-40 grams/day. The 
consumption of albumin is around 700 kilograms per million inhabitants per year. In addition to 
the high cost, HSA has the risk of transmitting diseases as with other blood-derivative products. 
The price of albumin is about $3.7/g. Thus, the market of this protein approximately amounts to 
$2,600,000 per million people per year (0.7 billion dollars per year in USA). Because of the 
high cost of albumin, synthetic macromolecules (like dextrans) are used to increase plasma 
colloidosmotic pressure. 
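
The market estimate above is the product of the per-capita consumption and the unit price. The sketch below reproduces that arithmetic from the two figures quoted in the text. The function names are illustrative, and the population implied by the 0.7 billion dollar US figure is back-calculated here, not stated in the source.

```python
# Figures quoted in the text.
CONSUMPTION_KG_PER_MILLION = 700  # kg of albumin per million inhabitants per year
PRICE_USD_PER_G = 3.7             # approximate price of albumin


def market_usd_per_million_people() -> float:
    """Annual albumin market per million inhabitants, in US dollars."""
    grams = CONSUMPTION_KG_PER_MILLION * 1000
    return grams * PRICE_USD_PER_G


def implied_population_millions(total_market_usd: float) -> float:
    """Population (in millions) that would account for a given total market."""
    return total_market_usd / market_usd_per_million_people()


print(f"{market_usd_per_million_people():,.0f} USD per million people per year")
print(f"{implied_population_millions(0.7e9):.0f} million people implied by a 0.7 billion USD market")
```

The per-million result rounds to the $2,600,000 quoted in the text.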

Commercial HSA is mainly prepared from human plasma. This source hardly meets the 
requirements of the world market. The availability of human plasma is limited and careful heat 
treatment of the product prepared must be performed to avoid potential contamination of the 
product by hepatitis, HIV and other viruses. The costs of HSA extraction from blood are very 
high. In order to meet the demands of the large albumin market with a safe product at a low cost, 
innovative production systems are needed. Plant biotechnology offers promise of obtaining safe 
and cheap proteins to be used to treat human diseases. 

INTERFERON ALPHA 

Interferons (IFNs) constitute a heterogeneous family of cytokines with antiviral, 
antigrowth, and immunomodulatory properties (24-26). Type I IFNs are acid-stable and 
constitute the first line of defence against viruses, both by displaying direct antiviral effects and 
by interacting with the cytokine cascade and the immune system. Their function is to induce 
regulation of growth and differentiation of T cells. The human IFN-α family consists of at least 
22 intronless genes, 9 of which are pseudogenes and 13 are expressed genes (subtypes) (27). 
Human IFN-α genes encode proteins of 188 or 189 amino acids. The first 23 amino acids 
constitute a signal peptide, and the other 165 or 166 amino acids form the mature protein. IFN-α 
subtypes show 78-94% homology at the nucleotide level. Presence of two disulfide bonds between 
Cys-1:Cys-99 and Cys-29:Cys-139 is conserved among all IFN-α species (28). Human IFN-α genes 
are expressed constitutively in organs of normal individuals (29,30). Individual IFN-α genes are 
differently expressed depending on the stimulus and they show restricted cell type expression 
(31). Although all IFN-α subtypes bind to a common receptor (32), several reports suggest that 
they show quantitatively distinct patterns of antiviral, growth inhibitory and immunomodulatory 
activities (33). IFN-α8 and IFN-α5 seem to have the greatest antiviral activity in liver tumour 
cells HuH7 (33). IFN-α5 has, at least, the same antiviral activity as IFN-α2 in in vitro 
experiments (unpublished data in Dr. Prieto's lab). It has been shown recently that IFN-α5 is the 
sole IFN-α subtype expressed in normal liver tissue (34). IFN-α5 expression in patients with 
chronic hepatitis C is reduced in the liver (34) and induced in mononuclear cells (35). 

Medical applications: Interferons are mainly known for their antiviral activities against a wide 
spectrum of viruses but also for their protective role against some non-viral pathogens. They are 
potent immunomodulators, possess direct antiproliferative activities and are cytotoxic or 
cytostatic for a number of different tumour cell types. IFN-α is mainly employed as a standard 
therapy for hairy cell leukaemia, metastasizing carcinoma and AIDS-associated angiogenic 
tumours of mixed cellularity known as Kaposi sarcomas. It is also active against a number of 
other tumours and viral infections. For example, it is the current approved therapy for chronic 
viral hepatitis B (CHB) and C (CHC). The IFN-α subtype used for chronic viral hepatitis is IFN- 
α2. About 40% of patients with CHB and about 25% of patients with CHC respond to this 
therapy with sustained viral clearance. The usual doses of IFN-α are 5-10 MU (subcutaneous 
injection) three days per week for 4-6 months for CHB and 3 MU three days per week for 12 
months for CHC. Three MU of IFN-α2 represent approximately 15 µg of recombinant protein. 
The response rate in patients with chronic hepatitis C can be increased by combining IFN-α2 and 
ribavirin. This combination therapy, which considerably increases the cost of the therapy and 
causes some additional side effects, results in sustained biochemical and virological remission in 
about 40-50% of cases. Recent data suggest that pegylated interferon in weekly doses of 180 µg 
can also increase the sustained response rate to about 40%. IFN-α5 is the only IFN-α subtype 
expressed in liver; this expression is reduced in patients with CHC and IFN-α5 seems to have 
one of the highest antiviral activities in liver tumour cells (see above). An international patent to 
use IFN-α5 has been filed by Prieto's group to facilitate commercial development (36). 

Expression systems: Human interferons are currently prepared in microbial systems via 
recombinant DNA technology in amounts which cannot be isolated from natural sources 
(leukocytes, fibroblasts, lymphocytes). Different recombinant interferon-α genes have been 
cloned and expressed in E. coli (37a,b) or yeast (38) by several groups. Generally, the 
synthesized protein is not correctly folded due to the lack of disulfide bridges and therefore, it 
remains insoluble in inclusion bodies that need to be solubilized and refolded to obtain the active 
interferon (39,40). One of the most efficient methods of interferon-α expression has been 
published recently by Babu et al. (41). In this method, E. coli cells transformed with interferon 
vectors (regulated by temperature inducible promoters) were grown in high cell density cultures; 
this resulted in the production of 4 g interferon-α/liter of culture. Expression resulted 
exclusively in the form of insoluble inclusion bodies which were solubilized under denaturing 
conditions, refolded and purified to near homogeneity. The yield of purified interferon-α was 
approximately 300 mg/l of culture. Expression in plants via the nuclear genome has not been 
very successful. Smirnov et al. (42) obtained transformed tobacco plants with Agrobacterium 
tumefaciens using the interferon-α gene under 35S CaMV promoter but the expression level was 
very low. Edelbaum et al. (43) showed tobacco nuclear transformation with interferon-β and 
the expression level detected was 0.000017% of fresh weight. 

Challenges in commercial production: The number of subjects infected with hepatitis C virus 
(HCV) is estimated to be 120 million (5 million in Europe and 4 million in USA). Seventy per 
cent of the infected people have abnormal liver function and about one third of these have severe 
viral hepatitis or cirrhosis. It might be estimated however that there are about 10,000-15,000 
cases of chronic infection with hepatitis B virus (HBV) in Europe, a slightly lower number of 
cases in USA. In Asia the prevalence of chronic HCV and HBV infection is very high (about 110 
million people are infected by HCV and about 150 million are infected by HBV). In Africa 
HCV infection is very prevalent. Since unremitting chronic viral hepatitis leads to liver cirrhosis 
and to liver cancer, the high prevalence of HBV and HCV infection in Asia and 
Africa accounts for their very high incidence of hepatocellular carcinoma. Based on these data, 
the need for IFN-α is large. IFN-α2 is currently produced in microorganisms by a number of 
companies and the price of 3 MU (15 µg) of recombinant protein in the western market is about 
$25. Thus, the cost of one year of IFN-α2 therapy is about $4,000 per patient. This price makes 
this product unavailable for most of the patients in the world suffering from chronic viral 
hepatitis. Clearly methods to produce less expensive recombinant proteins via plant 
biotechnology innovations would be crucial to make antiviral therapy widely available. Besides, 
if IFN-α5 is more efficient than IFN-α2, lower doses may be required. 
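
The annual cost quoted above can be checked against the CHC dosing schedule given earlier (3 MU three days per week for 12 months). The sketch below assumes a 52-week treatment year, which is not stated in the text. The constant and function names are illustrative.

```python
# Figures quoted in the text.
DOSES_PER_WEEK = 3        # 3 MU given three days per week
PRICE_USD_PER_DOSE = 25   # price of 3 MU of recombinant protein
MICROGRAMS_PER_DOSE = 15  # 3 MU corresponds to about 15 micrograms

# Assumption: a 12-month course is counted as 52 weeks.
WEEKS_PER_YEAR = 52


def annual_doses() -> int:
    return DOSES_PER_WEEK * WEEKS_PER_YEAR


def annual_cost_usd() -> int:
    return annual_doses() * PRICE_USD_PER_DOSE


def annual_protein_mg() -> float:
    return annual_doses() * MICROGRAMS_PER_DOSE / 1000


print(annual_doses(), "doses per year")
print(annual_cost_usd(), "USD per patient per year")
print(annual_protein_mg(), "mg of protein per patient per year")
```

Under these assumptions the yearly cost comes to $3,900, consistent with the rounded figure of about $4,000, while the protein requirement is only a few milligrams per patient.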

INSULIN-LIKE GROWTH FACTOR-I (IGF-I) 

The Insulin-like Growth Factor protein, IGF-I, is an anabolic hormone with a complex 
maturation process. A single IGF-I gene is transcribed into several mRNAs by alternative 
splicing and use of different transcription initiation sites (44-46). Depending on the choice of 
splicing, two immature proteins are produced: IGF-IA, expressed in several tissues and IGF-IB, 
mostly expressed in liver (45). Both pre-proteins produce the same mature protein. A and B 
immature forms have different lengths and composition, as their termini are modified post- 
translationally by glycosylation. However, these ends are processed in the last step of 
maturation. Mature IGF-I protein is secreted, not glycosylated and has three disulfide bonds, 70 
amino acids and a molecular weight of 7.6 kD (47-49). Physiologically, IGF-I expression is 
induced by growth hormone (GH). Actually, the knock out of IGF-I in mice has shown that 
several functions attributed originally to GH are in fact mediated by IGF-I. GH production by 
the adenohypophysis is repressed by feed-back inhibition of IGF-I. GH induces IGF-I synthesis in 
different tissues, but mostly in liver, where 90% of IGF-I is produced (48). The IGF-I receptor is 
expressed in different tissues. It is formed by two polypeptides: alpha that interacts with IGF-I 
and beta involved in signal transduction and also present in the insulin receptor (50,51). Thus, 
IGF-I and insulin activation are similar. 

Medical applications: IGF-I is a potent multifunctional anabolic hormone produced in the liver 
upon stimulation by growth hormone (GH). In liver cirrhosis the reduction of receptors for GH 
in hepatocytes and the diminished synthesis of the liver parenchyma cause a progressive fall of 
serum IGF-I levels. Patients with liver cirrhosis have a number of systemic derangements such 
as muscle atrophy, osteopenia, hypogonadism, protein-calorie malnutrition which could be 
related to reduced levels of circulating IGF-I. Recent studies from Prieto's laboratory have 
demonstrated that treatments with low doses of IGF-I induce significant improvements in 
nutritional status (52), intestinal absorption (53-55), osteopenia (56), hypogonadism (57) and 
liver function (58) in rats with experimental liver cirrhosis. These data support that IGF-I 
deficiency plays a pathogenic role in several systemic complications occurring in liver cirrhosis. 
The liver can be considered as an endocrine gland synthesising a hormone such as IGF-I with 
important physiological functions. Thus liver cirrhosis should be viewed as a disease 
accompanied by a hormone deficiency syndrome for which replacement therapy with IGF-I is 
warranted. Clinical studies are in progress to ascertain the role of IGF-I in the management of 
cirrhotic patients. IGF-I is also being currently used for Laron dwarfism treatment. These 
patients lack liver GH receptor so IGF-I is not expressed (59). Also IGF-I, acting as a 
hypoglycemic agent, is given together with insulin in diabetes mellitus (60,61). Anabolic effects 
of IGF-I are used in osteoporosis treatment (62,63), hypercatabolism and starvation due to burns 
and HIV infection (64,65). Unpublished studies indicate that IGF-I could also be used in patients 
with articular degenerative disease (osteoarthritis). 

Expression Systems: The potency of IGF-I has encouraged a great number of scientists to try 
IGF-I expression in various microorganisms due to the small amount present in human plasma. 
Production of IGF-I in yeast was shown to have several disadvantages like low fermentation 
yields and risks of obtaining undesirable glycosylation in these molecules (66). Expression in 
bacteria has been the most successful approach, either as a secreted form fused to protein leader 
sequences (67) or fused to a solubilized affinity fusion protein (68). In addition, IGF-I has been 
produced as insoluble inclusion bodies fused to protective polypeptides (69). Sun-Ok Kim and 
Young Lee (70a) expressed IGF-I as a truncated beta-galactosidase fusion protein. The final 
purification yielded approximately 5 mg of IGF-I having native conformation per liter of 
bacterial culture. IGF-I has also been expressed in animals. Zinovieva et al. (70b) reported an 
expression of 0.543 mg/ml in rabbit milk. 

Challenges in commercial production: IGF-I circulates in plasma in a fairly high concentration 
varying between 120-400 ng/ml. In cirrhotic patients the values of IGF-I fall to 20 ng/ml and 
frequently to undetectable levels. Replacement therapy with IGF-I in liver cirrhosis requires 
administration of 1.5-2 mg per day for each patient. Thus, every cirrhotic patient will consume 
about 600 mg per year. IGF-I is currently produced in bacteria (71). The high amount of 
recombinant protein needed for IGF-I replacement therapy in patients with liver cirrhosis will 
make this treatment exceedingly expensive if new methods for cheap production of recombinant 
proteins are not developed. Besides, as described above, IGF-I is used in treatment of dwarfism, 
diabetes, osteoporosis, starvation and hypercatabolism. IGF-I use in osteoarthritis is currently 
being investigated. Again, plant biotechnology could provide a solution to make economically 
feasible the application of IGF-I therapy to all these patients. 
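
The yearly IGF-I requirement per cirrhotic patient follows directly from the daily replacement dose quoted above. The sketch below assumes uninterrupted daily dosing over a 365-day year, which the text does not state explicitly. The function name is illustrative.

```python
def annual_requirement_mg(daily_dose_mg: float, days_per_year: int = 365) -> float:
    """Yearly IGF-I requirement for one patient, assuming daily dosing."""
    return daily_dose_mg * days_per_year


# Daily dose range quoted in the text: 1.5 to 2 mg per patient.
low = annual_requirement_mg(1.5)
high = annual_requirement_mg(2.0)
print(f"{low:.1f} to {high:.1f} mg per patient per year")
```

The figure of about 600 mg per year cited in the text falls inside this range.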

CHLOROPLAST GENETIC ENGINEERING 

When we developed the concept of chloroplast genetic engineering (72,73), it was
possible to introduce isolated intact chloroplasts into protoplasts and regenerate transgenic plants
(74). Therefore, early investigations on chloroplast transformation focused on the development
of in organello systems using intact chloroplasts capable of efficient and prolonged transcription
and translation (75-77) and expression of foreign genes in isolated chloroplasts (78). However,
after the discovery of the gene gun as a transformation device (79), it was possible to transform
plant chloroplasts without the use of isolated plastids and protoplasts. Chloroplast genetic
engineering was accomplished in several phases. Transient expression of foreign genes in
plastids of dicots (80,81) was followed by such studies in monocots (82). Unique to
chloroplast genetic engineering is the development of a foreign gene expression system using
autonomously replicating chloroplast expression vectors (80). Stable integration of a selectable
marker gene into the tobacco chloroplast genome (83) was also accomplished using the gene
gun. However, useful genes conferring valuable traits via chloroplast genetic engineering have
been demonstrated only recently. For example, plants resistant to B.t.-sensitive insects were
obtained by integrating the cry1Ac gene into the tobacco chloroplast genome (84). Plants
resistant to B.t.-resistant insects (up to 40,000 fold) were obtained by hyper-expression of the
cry2A gene within the tobacco chloroplast genome (85). Plants have also been genetically
engineered via the chloroplast genome to confer herbicide resistance and the introduced foreign
genes were maternally inherited, overcoming the problem of out-cross with weeds (86).
Chloroplast genetic engineering technology is currently being applied to other useful crops
(73,87).

c. PRELIMINARY STUDIES 

A remarkable feature of chloroplast genetic engineering is the observation of
exceptionally large accumulation of foreign proteins in transgenic plants, as much as 46% of
CRY protein in total soluble protein, even in bleached old leaves (3). Stable expression of a
pharmaceutical protein in chloroplasts was first reported for GVGVP, a protein based polymer
with varied medical applications (such as the prevention of post-surgical adhesions and scars,
wound coverings, artificial pericardia, tissue reconstruction and programmed drug delivery (88)).
Subsequently, expression of the human somatotropin via the tobacco chloroplast genome (9) to
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'o of total soluble protein) was observed. The following investigations that are in 
progress in the Daniell laboratory illustrate the power of this technology to express small 
peptides, entire operons, vaccines that require oligomeric proteins with stable disulfide bridges 
and monoclonals that require assembly of heavy/light chains via chaperonins. 
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20 Engineering novel pathways via the chloroplast genome: In plant and animal cells, nuclear 
mRNAs are translated monocistronically. This poses a serious problem when engineering
multiple genes in plants (91). Therefore, in order to express the polyhydroxybutyrate polymer or
Guy's 13 antibody, single genes were first introduced into individual transgenic plants, then
these plants were back-crossed to reconstitute the entire pathway or the complete protein (92,93).
Similarly, in a seven year long effort, Ye et al. (81) recently introduced a set of three genes for a
short biosynthetic pathway that resulted in β-carotene expression in rice. In contrast, most
chloroplast genes of higher plants are cotranscribed (91). Expression of polycistrons via the
chloroplast genome provides a unique opportunity to express entire pathways in a single
transformation event. We have recently used the Bacillus thuringiensis (Bt) cry2Aa2 operon as a
model system to demonstrate operon expression and crystal formation via the chloroplast
genome (3). Cry2Aa2 is the distal gene of a three-gene operon. The orf immediately upstream of
cry2Aa2 codes for a putative chaperonin that facilitates the folding of cry2Aa2 (and other
proteins) to form proteolytically stable cuboidal crystals (94).
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>re, the cry2Aa2 bacterial operon was expressed in tobacco chloroplasts to test the 
resultant transgenic plants for increased expression and improved persistence of the accumulated
insecticidal protein(s). Stable foreign gene integration was confirmed by PCR and Southern blot
analysis in T0 and T1 transgenic plants. Cry2Aa2 operon-derived protein accumulated at 45.3%
of the total soluble protein in mature leaves and remained stable even in old bleached leaves
(46.1%) (Figure 1). This is the highest level of foreign gene expression ever reported in
transgenic plants. Insects that are exceedingly difficult to control (10-day old cotton bollworm,
beet armyworm) were killed 100% after consuming transgenic leaves. Electron micrographs
showed the presence of the insecticidal protein folded into cuboidal crystals similar in shape to
Cry2Aa2 crystals observed in Bacillus thuringiensis (Figure 2). In contrast to currently marketed
transgenic plants with soluble CRY proteins, folded protoxin crystals will be processed only by
target insects that have alkaline gut pH; this approach should improve safety of Bt transgenic
plants. Absence of insecticidal proteins in transgenic pollen eliminates toxicity to non-target
insects via pollen. In addition to these environmentally friendly approaches, this observation
should serve as a model system for large-scale production of foreign proteins within chloroplasts
in a folded configuration, enhancing their stability and facilitating single step purification. This is
the first demonstration of expression of a bacterial operon in transgenic plants and opens the
door to engineering novel pathways in plants in a single transformation event.

Expressing small peptides via the chloroplast genome: It is common knowledge that the
medical community has been fighting a vigorous battle against drug resistant pathogenic bacteria
for years. Cationic antibacterial peptides from mammals, amphibians and insects have gained
more attention over the last decade (95). Key features of these cationic peptides are a net positive
charge, an affinity for negatively-charged prokaryotic membrane phospholipids over
neutral-charged eukaryotic membranes and the ability to form aggregates that disrupt the bacterial
membrane (96).

There are three major peptides with α-helical structures: cecropin from Hyalophora
cecropia (giant silk moth), magainins from Xenopus laevis (African frog) and defensins from
mammalian neutrophils. Magainin and its analogues have been studied as a broad-spectrum
topical agent, a systemic antibiotic, a wound-healing stimulant and an anticancer agent (97). We
have recently observed that a synthetic lytic peptide (MSI-99, 22 amino acids) can be
successfully expressed in tobacco chloroplasts (98). The peptide retained its lytic activity against
the phytopathogenic bacteria Pseudomonas syringae and multidrug resistant human pathogen,
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aeruginosa. The anti-microbial peptide (AMP) used in this study was an 
amphipathic alpha-helix molecule that has an affinity for negatively charged phospholipids 
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commonly found in the outer-membrane of bacteria. Upon contact with these membranes, 
individual peptides aggregate to form pores in the membrane, resulting in bacterial lysis.
Because of the concentration dependent action of the AMP, it was expressed via the chloroplast
genome to accomplish high dose delivery at the point of infection. PCR products and Southern
blots confirmed chloroplast integration of the foreign genes and homoplasmy. Growth and
development of the transgenic plants was unaffected by hyper-expression of the AMP within
chloroplasts. In vitro assays with T0 and T1 plants confirmed that the AMP was expressed at
high levels (21.5 to 43% of the total soluble protein) and retained biological activity against
Pseudomonas syringae, a major plant pathogen. In situ assays resulted in intense areas of
necrosis around the point of infection in control leaves, while transformed leaves showed no
signs of necrosis (200-800 µg of AMP at the site of infection) (Figure 3). T1 in vitro assays
against Pseudomonas aeruginosa (a multi-drug resistant human pathogen) displayed a 96%
inhibition of growth (Figure 4). These results give a new option in the battle against
phytopathogenic and drug-resistant human pathogenic bacteria. Small peptides (like insulin) are
degraded in most organisms. However, stability of this AMP in chloroplasts opens up this
compartment for expression of hormones and other small peptides.
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ch lera toxin (3 subunit oligomers as a vaccine in chloroplasts: Vibrio 
cholerae causes acute watery diarrhea by colonizing the small intestine and producing the
enterotoxin, cholera toxin (CT). Cholera toxin is a hexameric AB5 protein consisting of one
toxic 27 kDa A subunit having ADP ribosyl transferase activity and a nontoxic pentamer of 11.6
kDa B subunits (CTB) that binds to the A subunit and facilitates its entry into the intestinal
epithelial cells. CTB when administered orally (99) is a potent mucosal immunogen which can
neutralize the toxicity of the CT holotoxin by preventing it from binding to the intestinal cells
(100). This is believed to be a result of its binding to eukaryotic cell surfaces via the GM1
gangliosides, receptors present on the intestinal epithelial surface, thus eliciting a mucosal
immune response to pathogens (101) and enhancing the immune response when chemically
coupled to other antigens (102-105).

Cholera toxin B subunit (CTB) has previously been expressed in nuclear transgenic plants at levels
of 0.01% (leaves) to 0.3% (tubers) of the total soluble protein. To increase expression levels, we
engineered the chloroplast genome to express the CTB gene (10). We observed expression of
oligomeric CTB at levels of 4-5% of total soluble plant protein (Figure 5A). PCR and Southern
blot analyses confirmed stable integration of the CTB gene into the chloroplast genome.
Western blot analysis showed that transgenic chloroplast expressed CTB was antigenically
identical to commercially available purified CTB antigen (Figure 6). Also, GM1-ganglioside
binding assays confirm that chloroplast synthesized CTB binds to the intestinal membrane
receptor of cholera toxin (Figure 5B). Transgenic tobacco plants were morphologically
indistinguishable from untransformed plants and the introduced gene was found to be stably
inherited in the subsequent generation as confirmed by PCR and Southern blot analyses. The
increased production of an efficient transmucosal carrier molecule and delivery system, like
CTB, in chloroplasts of plants makes plant based oral vaccines, and fusion proteins with CTB
needing oral administration, a much more feasible approach. This also establishes unequivocally
that chloroplasts are capable of forming disulfide bridges to assemble foreign proteins.

Expression and assembly of monoclonals in transgenic chloroplasts: Dental caries (cavities)
is probably the most prevalent disease of humankind. Colonization of teeth by Streptococcus
mutans (S. mutans) is the single most important risk factor in the development of dental caries.
S. mutans is a non-motile, gram positive coccus. It colonizes tooth surfaces and synthesizes
glucans (insoluble polysaccharide) and fructans from sucrose using the enzymes
glucosyltransferase and fructosyltransferase respectively (106a). The glucans play an important role by allowing the
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dhere to the smooth tooth surfaces. After its adherence, the bacterium ferments 
sucrose and produces lactic acid. Lactic acid dissolves the minerals of the tooth, producing a 
cavity. 

A topical monoclonal antibody therapy to prevent adherence of S. mutans to teeth has
recently been developed. The incidence of cariogenic bacteria (in humans and animals) and
dental caries (in animals) was dramatically reduced for periods of up to two years after the
cessation of the antibody therapy. No adverse events were detected either in the exposed animals
or in human volunteers (106b). The annual requirement for this antibody in the US alone may
eventually exceed 1 metric ton. Therefore, this antibody was expressed via the chloroplast
genome to achieve higher levels of expression and proper folding (11). The integration of
antibody genes into the chloroplast genome was confirmed by PCR and Southern blot analysis.
The expression of both heavy and light chains was confirmed by western blot analysis under
reducing conditions (Figure 7A,B). The expression of fully assembled antibody was confirmed
by western blot analysis under non-reducing conditions (Figure 7C). This is the first report of
successful assembly of a multi-subunit human protein in transgenic chloroplasts. Production of
monoclonal antibodies at an agricultural scale should reduce their cost and create new applications
of monoclonal antibodies.

HUMAN SERUM ALBUMIN 
Nuclear transformation: Recently, Dr. Mingo-Castel's group in Spain (a Co-PI in this proposal)
cloned the human HSA cDNA from human liver cells and fused it to the patatin promoter (whose
expression is tuber specific (107)) along with the leader sequence of PIN II (proteinase II
inhibitor), a potato transit peptide that directs HSA to the apoplast (108). Leaf discs of Desiree and
Kennebec potato plants were transformed using Agrobacterium tumefaciens. A total of 98
transgenic Desiree clones and 30 Kennebec clones were tested by PCR and western blots.
Western blots showed that the proteinase II inhibitor transit peptide had been properly cleaved
from the recombinant albumin (rHSA) (Figure 8). Expression levels varied widely among the
transgenic clones of both cultivars, as expected (Figure 9), probably because of position effects
and gene silencing (89,90). The population distribution was similar in both cultivars: the majority of
transgenic clones showed expression levels between 0.04 and 0.06% of rHSA in the total soluble
protein. The maximum recombinant HSA amount expressed was 0.2%. Between one and five
T-DNA insertions per tetraploid genome were observed in these clones. Plants with higher protein
expression were always clones with several copies of the HSA gene. Levels of mRNA were
analyzed by Northern blots. There was a correlation between transcript levels and recombinant
albumin accumulation in transgenic tubers. The N-terminal sequence showed proper cleavage of
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ide and the amino terminal sequence between recombinant and human HSA was 
identical. Inhibition of patatin expression using the antisense technology did not improve the 
amount of rHSA. Average expression level among 29 transgenic plants was 0.032% of total 
soluble protein, with a maximum expression of 0.1%.

Chloroplast transformation: We have also initiated transformation of the tobacco chloroplast
genome for hyperexpression of HSA. The codon composition is ideal for chloroplast expression
and no changes in nucleotide sequences were necessary (see section d.3). For all the constructs
the pLD vector was used (see description in section d.4). We designed several vectors to optimize
HSA expression. All of these contain ATG as the first codon, placed immediately before the sequence of the mature protein.

1- RBS-ATG-HSA: The first vector includes the gene that codes for the mature HSA plus an
additional ATG as a translation initiation codon. We included the ATG in one of the primers of
the PCR, 5 nucleotides downstream of the chloroplast preferred RBS sequence GGAGG. The
cDNA sequence of the mature HSA (cloned in Dr. Mingo-Castel's laboratory) was used as a
template. The PCR product was cloned into PCR 2.1 vector, excised as an EcoRI-NotI fragment
and introduced into the pLD vector.

2- 5'UTRpsbA-ATG-HSA: The 200 bp tobacco chloroplast DNA fragment containing the 5'
psbA UTR (untranslated region; see section d.3) was amplified using PCR and tobacco DNA as
template. The fragment was cloned into PCR 2.1 vector, and the excised EcoRI-NcoI fragment was
inserted at the NcoI site of the ATG-HSA. The product was finally inserted into the pLD vector as an
EcoRI-NotI fragment downstream of the 16S rRNA promoter to enhance translation of the protein.

3- BtORF1+2-ATG-HSA: ORF1 and ORF2 of the Bt Cry2Aa2 operon (see section c and d.3)
were amplified in a PCR using the complete operon as a template. The fragment was cloned into
PCR 2.1 vector, excised as an EcoRI-EcoRV fragment, inserted at the EcoRV site with the
ATG-HSA sequence and introduced into the pLD vector as an EcoRI-NotI fragment. The ORF1 and
ORF2 were fused upstream of the ATG-HSA.
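
The spacing between the ribosome binding site and the start codon described for the first vector can be checked on a candidate primer with a few lines of code. This is only an illustrative sketch: the primer shown is hypothetical (the actual primer sequences are not given in the text), and it assumes that "5 nucleotides downstream" means that five spacer nucleotides separate the end of GGAGG from the ATG.

```python
from typing import Optional

RBS = "GGAGG"   # chloroplast-preferred ribosome binding site named in the text
START = "ATG"   # translation initiation codon added ahead of the mature HSA


def spacer_length(seq: str) -> Optional[int]:
    """Return the number of nucleotides between the end of the first RBS
    and the first ATG that follows it, or None if either motif is missing."""
    seq = seq.upper()
    rbs_pos = seq.find(RBS)
    if rbs_pos == -1:
        return None
    rbs_end = rbs_pos + len(RBS)
    atg_pos = seq.find(START, rbs_end)
    if atg_pos == -1:
        return None
    return atg_pos - rbs_end


# Hypothetical primer: RBS + 5 nt spacer + ATG + GAT (a codon for Asp).
# The spacer "TTAAC" is a placeholder, not taken from the original work.
primer = "GGAGG" + "TTAAC" + "ATG" + "GAT"
print(spacer_length(primer))
```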

Because of the similarity of protein synthetic machinery (109), expression of all
chloroplast vectors was first tested in E. coli before their use in tobacco transformation. Different
levels of expression were obtained in E. coli depending on the construct (Figure 10). Using the
psbA 5' UTR and the ORF1 and ORF2 of the cry2Aa2 operon, we obtained higher levels of
expression than using only the RBS. We have observed in previous experiments that HSA in E.
coli is completely insoluble (as is shown in ref 14), probably due to an improper folding resulting
from the absence of disulfide bonds. This is the reason why the protein is precipitated in the gel
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ifferent polypeptide sizes were observed, probably due to incomplete translation. 
Assuming that E. coli and chloroplasts have similar protein synthesis machinery, one could expect
different levels of expression in transgenic tobacco chloroplasts depending on the regulatory
sequences, with the advantage that disulfide bonds are formed in chloroplasts (9). These three
vectors were introduced into tobacco leaves via particle bombardment (110) and after 4 weeks
small shoots appeared as a result of independent transformation events. Characterization of these
transformants is in progress. A progress report will be provided when the panel meets.

INTERFERON-α5

Interferon-α5 has not been expressed yet as a commercial recombinant protein. The first
attempt has been made recently in Prieto's laboratory. The IFN-α5 gene was cloned and the
sequence of the mature protein was inserted into the pET28 vector, which includes the ATG,
histidine tag for purification and thrombin cleavage sequences. The tagged IFN-α5 was purified
first by binding to a nickel column and biotinylated thrombin was then used to eliminate the tag
on IFN-α5. Biotinylated thrombin was removed from the preparation using streptavidin agarose.
The expression level was 5.6 micrograms per liter of broth culture and the recombinant protein
showed antiviral activity similar to or higher than that of commercial IFN-α2 (Intron A,
Schering-Plough).

INSULIN-LIKE GROWTH FACTOR-I (IGF-I)

Recent studies in Prieto's laboratory have demonstrated that treatment with low doses of
IGF-I induced significant improvements in nutritional status (52), intestinal absorption (53-55),
osteopenia (56), hypogonadism (57) and liver function (58) in rats with experimental liver
cirrhosis. These data support the view that IGF-I deficiency plays a pathogenic role in several systemic
complications occurring in liver cirrhosis. Clinical studies are in progress to ascertain the role of
IGF-I in the management of cirrhotic patients. Unpublished studies indicate that IGF-I could also
be used in patients with articular degenerative disease (osteoarthritis).

d. RESEARCH DESIGN AND METHODS 

d.1 Evaluation of chloroplast gene expression: A systematic approach to identify and
overcome potential limitations of foreign gene expression in chloroplasts of transgenic plants is
essential. Information gained in this study should increase the utility of the chloroplast
transformation system for scientists interested in expressing other foreign proteins. Therefore, it
is important to systematically analyze transcription, RNA abundance, RNA stability, rate of
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>f the introduced HSA gene will be compared with the highly expressing 
endogenous chloroplast genes (rbcL, psbA, 16S rRNA), using run on transcription assays to
determine if the 16S rRNA promoter is operating as expected. Transgenic chloroplasts containing
each of the three constructs with different 5' regions (see preliminary studies in section c) will be
investigated to test their transcription efficiency. Similarly, transgene RNA levels will be
monitored by northerns, dot blots and primer extension relative to endogenous rbcL, 16S rRNA
or psbA. These results along with run on transcription assays should provide valuable
information on RNA stability, processing, etc. In our past experience with expression of several
foreign genes, RNA appears to be extremely stable based on northern blot analysis. However, a
systematic study would be valuable to advance utility of this system by other scientists. Most
importantly, the efficiency of translation will be tested in isolated chloroplasts and compared
with the highly translated chloroplast protein (psbA). Pulse chase experiments would help assess
whether translational pausing or premature termination occurs. Evaluation of percent RNA loaded on
polysomes or in constructs with or without 5' UTRs would help determine the efficiency of the
ribosome binding site and 5' stem-loop translational enhancers. Codon optimized genes (IGF-I,
IFN) will also be compared with unmodified genes to investigate the rate of translation, pausing
and termination. In our recent experience, we observed a 200-fold difference in accumulation of
foreign proteins due to decreases in proteolysis conferred by a putative chaperonin (3).
Therefore, proteins from constructs expressing or not expressing the putative chaperonin (with or
without ORF1+2) should provide valuable information on protein stability. Thus, all of this
information will be used to improve the next generation of chloroplast vectors. The PI has
extensive experience in analysis of chloroplast gene expression.

d.2 Expression of the mature protein: HSA, Interferon and IGF-I are pre-proteins that need to
be cleaved to secrete mature proteins. The codon for translation initiation is in the presequence.
In chloroplasts, expressing the mature protein therefore requires adding an initiation codon, which
introduces an additional amino acid (methionine) into the coding sequence. In order to optimize
expression levels, we will first subclone the
sequence of the mature proteins beginning with an ATG. Subsequent immunological assays in
mice will be done with those proteins to investigate if the extra methionine can cause an
immunogenic response or low bioactivity. Alternatively, we will develop systems to produce the
mature protein. These systems can include the synthesis of a protein fused to a peptide that is
cleaved intracellularly (processed) by chloroplast enzymes or the use of chemical or enzymatic
cleavage after partial purification of proteins from plant cells.
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:s that are cleaved in chioroplast: Staub et aL (9) reported chloroplast expression 
of human somatotropin similar to the native human protein by using ubiquitin fusions that were
cleaved in the stroma by a ubiquitin protease. However, the processing efficiency ranged from
30 to 80% and the cleavage site was not accurate. In order to process chloroplast expressed proteins,
a peptide which is cleaved in the stroma is essential. The transit peptide sequence of the
RuBisCo (ribulose 1,5-bisphosphate carboxylase) small subunit would be an ideal choice. This
transit peptide has been studied in depth (111). RuBisCo is one of the proteins that is synthesized
in the cytoplasm and transported post-translationally into the chloroplast in an energy dependent
process. The transit peptide is proteolytically removed upon transport into the stroma by the
stromal processing peptidase (112). There are several sequences described for different species
(113). A transit peptide consensus sequence for the RuBisCo small subunit of vascular plants was
published by Keegstra et al. (114). The amino acids that are proximal to the C-terminus (41-59)
are highly conserved in the higher plant transit sequences and belong to the domain which is
involved in enzymatic cleavage (111). The RuBisCo small subunit transit peptide has been fused
with various marker proteins (114,115), even with animal proteins (116,117), to target proteins
to the chloroplast. Prior to transformation studies, the cleavage efficiency and accuracy will be
tested by in vitro translation of the fusion proteins and in organello import studies using intact
chloroplasts. Once we know the correct fusion sequence for producing the mature protein, such
sequence encoding the amino terminal portion of tobacco chloroplast transit peptide will be
linked with the mature sequence of each protein. Codon composition of the tobacco RuBisCo
small subunit transit peptide appears to be compatible with chloroplast optimal translation (see
section d.3 and table 1 on page 30). Additional transit peptide sequences for targeting and
cleavage in the chloroplast have been described (111). If we find that the RuBisCo small
subunit transit peptide is not suitable, other transit peptides with cleavage in the stroma will be
studied. The lumen of thylakoids could be a good target because thylakoids are easy to purify. It
is relatively easy to free lumenal proteins either by sonication or with a very low Triton X-100
concentration. However, this may require insertion of additional amino acid sequences for
efficient import (111).

Use of chemical or enzymatic cleavage: The strategy of fusing a protein to a tag with affinity
for a certain ligand has been used extensively for more than a decade to enable affinity
purification of recombinant products (118-120). A vast number of cleavage methods, both
chemical and enzymatic, have been investigated for this purpose (120). Chemical cleavage
methods have low specificity and the relatively harsh cleavage conditions can result in chemical
modifications of the released products (120). Some of the enzymatic methods offer significantly
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3 specificities together with high efficiency, e. g. H64A subtilisin, IgA protease 
and factor Xa (119,120), but these enzymes have the drawback of being quite expensive.

Trypsin, which cleaves C-terminal of basic amino acid residues, has been used for a long
time to cleave fusion proteins (14,121). Despite expected low specificity, trypsin has been shown
to be useful for specific cleavage of fusion proteins, leaving basic residues within folded protein
domains uncleaved (121). The use of trypsin only requires that the N-terminus of the mature
protein be accessible to the protease and that the potential internal sites are protected in the
native conformation. Trypsin has the additional advantage of being inexpensive and readily
available. In the case of HSA, when it was expressed in E. coli with 6 additional codons coding
for a trypsin cleavage site, HSA was processed successfully into the mature protein after
treatment with the protease. In addition, the N-terminal sequence was found to be unique and
identical to the sequence of natural HSA, the conversion was complete and no degradation
products were observed (14). This in vitro maturation is selective because correctly folded
albumin is highly resistant to trypsin cleavage at inner sites (14). This system could be tested for
chloroplast HSA vectors using protein expressed in E. coli.

Staub et al. (9) demonstrated that the chloroplast methionine aminopeptidase is active and
they found 95% removal of the first methionine of an ATG-somatotropin protein that was
expressed via the chloroplast genome. Several investigations have shown a very
strict pattern of cleavage by this peptidase (122). Methionine is only removed when the second
residue is glycine, alanine, serine, cysteine, threonine, proline or valine, but if the third amino
acid is proline the cleavage is inhibited. In the expression of our three proteins we could use this
approach to obtain the mature protein in the case of Interferon, because the second
amino acid (after the initiator methionine) is cysteine, followed by aspartic acid. For HSA the
second amino acid is aspartic acid, and for IGF-I it is glycine but followed by proline, so the
cleavage may not be possible.
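
The methionine aminopeptidase rule stated above can be written as a small predicate. This is a minimal sketch that encodes only the rule as given in the text (the function name is illustrative); it is not a validated predictor of in vivo processing. The example inputs use only the N-terminal residues mentioned in the text: Met-Cys-Asp for Interferon, Met-Asp for HSA and Met-Gly-Pro for IGF-I.

```python
# Residues that permit removal of the initiator Met when found at position 2
# (one-letter codes for Gly, Ala, Ser, Cys, Thr, Pro, Val), as listed in the text.
PERMISSIVE_SECOND_RESIDUES = set("GASCTPV")


def initiator_met_removed(protein: str) -> bool:
    """Apply the stated rule: Met is removed only if residue 2 is permissive
    and residue 3 is not proline."""
    protein = protein.upper()
    if len(protein) < 2 or protein[0] != "M":
        return False  # no initiator Met, or nothing follows it
    if protein[1] not in PERMISSIVE_SECOND_RESIDUES:
        return False  # second residue blocks removal
    if len(protein) > 2 and protein[2] == "P":
        return False  # proline at position 3 inhibits cleavage
    return True


for name, n_terminus in [("Interferon", "MCD"), ("HSA", "MD"), ("IGF-I", "MGP")]:
    print(name, initiator_met_removed(n_terminus))
```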

For IGF-I expression, the use of the TEV protease (Gibco cat. no. 10127-017) would be
ideal. The cleavage site that is recognized by this protease is Glu-Asn-Leu-Tyr-Phe-Gln-Gly and
it cuts between Gln and Gly. This strategy would allow the release of the mature protein by
incubation with TEV protease, leaving a glycine as the first amino acid, consistent with the human
mature IGF-I protein.

In the E. coli Interferon-α5 expression method developed in Dr. Prieto's laboratory (see
section c), the purification system was based on 6 Histidine-tags that bind to a nickel column
and biotinylated thrombin to eliminate the tag on IFN-α5. Thrombin recognizes
Leu-Val-Pro-Arg-Gly-Ser and cuts between Arg and Gly. This would leave two extra amino acids in the
mature protein, but antiviral activity studies have been done showing that this protein is at least
as active as commercial IFN-α2 (unpublished data in Dr. Prieto's laboratory).
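
The two site-specific cleavage strategies described above (TEV protease and thrombin) differ in what remains on the released protein, which can be illustrated with a short sketch. The recognition sequences and cut positions are those stated in the text. The fusion sequences, the six-histidine tag layout and the payload residues after the first two are hypothetical placeholders, not the constructs actually used.

```python
from typing import Tuple

# Recognition motif (one-letter code) and the index within the motif where
# the protease cuts, as described in the text.
SITES = {
    "TEV": ("ENLYFQG", 6),      # Glu-Asn-Leu-Tyr-Phe-Gln | Gly
    "thrombin": ("LVPRGS", 4),  # Leu-Val-Pro-Arg | Gly-Ser
}


def cleave(fusion: str, protease: str) -> Tuple[str, str]:
    """Split a fusion protein at the first recognition site of the protease.
    Returns (tag_fragment, released_fragment)."""
    motif, cut_offset = SITES[protease]
    pos = fusion.find(motif)
    if pos == -1:
        raise ValueError(f"no {protease} site found")
    cut_at = pos + cut_offset
    return fusion[:cut_at], fusion[cut_at:]


# Hypothetical fusions: His tag + site + placeholder payload.
# For TEV, the Gly of the site doubles as the first residue of the payload (Gly-Pro...).
tev_fusion = "HHHHHH" + "ENLYFQ" + "GPAAA"
thrombin_fusion = "HHHHHH" + "LVPRGS" + "CDAAA"

print(cleave(tev_fusion, "TEV")[1])            # no extra residues on the payload
print(cleave(thrombin_fusion, "thrombin")[1])  # Gly-Ser remains on the payload
```

Under these assumptions, the TEV design releases the payload starting at its own glycine, whereas the thrombin design leaves the two extra residues (Gly-Ser) noted in the text.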
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d.3 Optimization of gene expression: We have reported that foreign genes are expressed 
between 3% (cry2Aa2) and 47% (cry2Aa2 operon) in transgenic chloroplasts (3,85). Based on
the outcome of the evaluation of HSA chloroplast transgenic plants, several approaches will be
used to enhance translation of the recombinant proteins. In chloroplasts, transcriptional
regulation of gene expression is less important, although some modulations by light and
developmental conditions are observed (123). RNA stability appears to be among the least of the
problems because of the observation of excessive accumulation of foreign transcripts, at times
16,966-fold higher than the highly expressing nuclear transgenic plants (124). Chloroplast gene
expression is regulated to a large extent at the post-transcriptional level. For example, 5' UTRs
are necessary for optimal translation of chloroplast mRNAs. Shine-Dalgarno (GGAGG)
sequences as well as a stem-loop structure located 5' adjacent to the SD sequence are required
for efficient translation. A recent study has shown that insertion of the psbA 5' UTR
downstream of the 16S rRNA promoter enhanced translation of a foreign gene (GUS)
hundred-fold (125a). Therefore, the 200-bp tobacco chloroplast DNA fragment (1680-1480) containing
the 5' psbA UTR will be used. This PCR product will be inserted downstream of the 16S rRNA
promoter to enhance translation of the recombinant proteins.

Yet another approach for enhancement of translation would be to optimize codon
compositions. Since all the three proteins are translated in E. coli (see section b), it would be
reasonable to expect efficient expression in chloroplasts. However, optimizing codon
compositions to match the psbA gene could further enhance the level of translation. Although
rbcL (RuBisCO) is the most abundant protein on earth, it is not translated as highly as the psbA
gene due to the extremely high turnover of the psbA gene product. The psbA gene is under
stronger selection for increased translation efficiency and is the most abundant thylakoid protein.
In addition, the codon usage in higher plant chloroplasts is biased towards the NNC codon of
2-fold degenerate groups (i.e. TTC over TTT, GAC over GAT, CAC over CAT, AAC over AAT,
ATC over ATT, ATA etc.). This is in addition to a strong bias towards T at the third position of
4-fold degenerate groups. There is also a context effect that should be taken into consideration
while modifying specific codons. The 2-fold degenerate sites immediately upstream from a
GNN codon do not show this bias towards NNC (TTT GGA is preferred to TTC GGA while
TTC CGT is preferred to TTT CGT, TTC AGT to TTT AGT and TTC TCT to TTT
TCT) (125b,126). In addition, highly expressed chloroplast genes use GNN more frequently than
other genes. The web site http://www.kazusa.or.jp/codon will be used to optimize codon
composition by comparing different species. Abundance of amino acids in chloroplasts and
tRNA anticodons present in chloroplasts will be taken into consideration. We also compared
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: of all foreign genes that had been expressed in transgenic chloroplasts in our 
laboratory with the percentage of chloroplast expression. We found that higher levels of A+T 
always correlated with higji expression levels (see table 1 on page 30). It is also possible to 
modify chloroplast protease recognition sites while modifying codons, without affecting their 
5 biological functions. 

The sequences of HSA, IGF-I and interferon-α5 were analyzed. The HSA
sequence showed 57% A+T content, and 40% of the total codons matched the most highly
translated psbA codons. According to the data in table 1, we should expect good chloroplast
expression of the HSA gene without any modification of its codon composition. IFN-α5 has
54% A+T content and 40% of its codons match psbA codons. The composition seems to be good,
but this protein is small (166 amino acids) and it would be easy to optimize the sequence to
achieve A+T levels close to 65%. Finally, analysis of the IGF-I sequence showed that the
A+T content was 40% and only 20% of the codons are those most translated in psbA. Therefore,
this gene needs to be optimized. Optimization of these two genes will be done using a novel PCR
approach (127,128), which has been successfully used in our laboratory to optimize the codon
composition of other human proteins.
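
The two figures quoted for each gene (percent A+T and percent of codons matching the preferred set) can be computed along the following lines. This is only a minimal sketch: the function names, the toy sequence and the preferred-codon set are illustrative assumptions. The set contains only the NNC-over-NNT examples listed in section d.3 and is not the full table of most-translated psbA codons.

```python
def at_content(seq: str) -> float:
    """Return the percentage of A and T bases in a DNA sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return 100.0 * (seq.count("A") + seq.count("T")) / len(seq)


def preferred_codon_match(seq: str, preferred: set) -> float:
    """Return the percentage of complete codons in seq that are in `preferred`."""
    seq = seq.upper()
    usable = len(seq) - len(seq) % 3          # ignore a trailing partial codon
    codons = [seq[i:i + 3] for i in range(0, usable, 3)]
    if not codons:
        raise ValueError("sequence shorter than one codon")
    hits = sum(1 for codon in codons if codon in preferred)
    return 100.0 * hits / len(codons)


# Hypothetical preferred set: only the NNC codons named as examples in the text.
PREFERRED = {"TTC", "GAC", "CAC", "AAC", "ATC"}

# Illustrative toy sequence (not HSA, IGF-I or IFN): ATG TTC GAC CAC AAT ATT
toy = "ATGTTCGACCACAATATT"
print(round(at_content(toy), 1), preferred_codon_match(toy, PREFERRED))
```

A real analysis would replace PREFERRED with the codon table derived from the psbA coding sequence and would also account for the context effect upstream of GNN codons described above.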

d.4 Vector constructions: The pLD vector will be used for all constructs. This vector was
developed in this laboratory for chloroplast transformation. It contains the 16S rRNA promoter
(Prrn) driving the selectable marker gene aadA (aminoglycoside adenyl transferase conferring
resistance to spectinomycin) followed by the psbA 3' region (the terminator from a gene coding
for photosystem II reaction center components) from the tobacco chloroplast genome. The pLD
vector is a universal chloroplast expression/integration vector and can be used to transform
chloroplast genomes of several other plant species (73,86) because its flanking sequences are
highly conserved among higher plants. The universal vector uses the trnA and trnI genes
(chloroplast transfer RNAs coding for alanine and isoleucine) from the inverted repeat region of
the tobacco chloroplast genome as flanking sequences for homologous recombination. Because
the universal vector integrates foreign genes within the inverted repeat region of the chloroplast
genome, it should double the copy number of the transgene (from 5000 to 10,000 copies per cell
in tobacco). Furthermore, it has been demonstrated that homoplasmy is achieved even in the first
round of selection in tobacco, probably because of the presence of a chloroplast origin of
replication within the flanking sequence in the universal vector (thereby providing more
templates for integration). For these and several other reasons, foreign gene expression
was shown to be much higher when the universal vector was used instead of the tobacco-specific
vector (88).
We will design the following vectors to optimize protein expression, purification and production
of proteins with the same amino acid composition as in human proteins.

a) In order to optimize expression we will increase translation using the psbA 5' UTR (see
section d.3) and optimizing the codon composition for protein expression in chloroplasts
according to the criteria discussed in section d.3. The 200 bp tobacco chloroplast DNA fragment
containing the 5' psbA UTR will be amplified by PCR using tobacco chloroplast DNA as
template. This fragment will be cloned directly into the pLD vector multiple cloning site
(EcoRI-NcoI) downstream of the promoter and the aadA gene. The cloned sequence will be
exactly the same as in the psbA gene.

b) For enhancing protein stability and facilitating purification, the putative chaperonin derived
from the Bacillus thuringiensis cry2Aa2 operon will be used. Expression of the cry2Aa2
operon in chloroplasts provides a model system for hyper-expression of foreign proteins
(46% of total soluble protein) in a folded configuration, enhancing their stability and
facilitating purification (3). This justifies inclusion of the putative chaperonin from the
cry2Aa2 operon in one of the newly designed constructs. In this region there are two open
reading frames (ORF1 and ORF2) and a ribosomal binding site (rbs). This sequence contains
elements necessary for Cry2Aa2 crystallization, which may help to crystallize the HSA, IGF-I
and IFN-α proteins, aiding in the subsequent purification. Successful crystallization of other
proteins using this putative chaperonin has been demonstrated (94). We will amplify
ORF1 and ORF2 of the Bt cry2Aa2 operon by PCR using the complete operon as template.
The fragment will be cloned into a pCR 2.1 vector and excised as an EcoRI-EcoRV product.
This fragment will be cloned directly into the pLD vector multiple cloning site (EcoRI-EcoRV)
downstream of the promoter and the aadA gene.

c) To obtain proteins with the same amino acid composition as mature human proteins, we will
first fuse all three genes (codon optimized and native sequence) with the RuBisCo small
subunit transit peptide. Other constructs will also be made to allow cleavage of the protein
after isolation from chloroplasts (see section d.2). These strategies would also allow affinity
purification of the proteins.

The first set of constructs will include the sequence of each protein beginning with an ATG,
introduced by PCR using primers. Once we achieve optimal expression levels, and if the ATG is
a problem (determined by immunological assays in mice), processing to obtain the mature
protein will be addressed. The first attempt will be the use of the RuBisCo small subunit transit
peptide. This transit peptide will be amplified by PCR using tobacco DNA as template and
cloned into the pCR 2.1 vector. All genes will be fused with the transit peptide using a MluI
restriction site that will be introduced in the PCR primers for amplification of the transit peptide
and of the genes coding for the three proteins. The gene fusions will be inserted into the pLD vectors
downstream of the 5' UTR or ORF1+2 using the restriction sites NcoI and EcoRV, respectively. If
use of tags or protease sequences is necessary, such sequences will be introduced by designing
primers including these sequences and amplifying the gene by PCR. After completing vector
construction, all the vectors will be sequenced to confirm the correct nucleotide sequence and
in-frame fusion. DNA sequencing will be done using a Perkin Elmer ABI Prism 373 DNA
sequencing system.



Because of the similarity of the protein synthetic machinery (109), expression of all
chloroplast vectors will first be tested in E. coli before their use in tobacco transformation. For
Escherichia coli expression, the XL-1 Blue strain will be used. E. coli will be transformed by
standard CaCl2 transformation procedures and grown in TB culture medium. Purification,
biological and immunogenic assays will be done using E. coli expressed proteins.

d.5 Bombardment, Regeneration and Characterization of Chloroplast Transgenic Plants:

Tobacco (Nicotiana tabacum var. Petit Havana) plants will be grown aseptically by germination
of seeds on MSO medium. This medium contains MS salts (4.3 g/liter), B5 vitamin mixture
(myo-inositol, 100 mg/liter; thiamine-HCl, 10 mg/liter; nicotinic acid, 1 mg/liter; pyridoxine-HCl,
1 mg/liter), sucrose (30 g/liter) and phytagar (6 g/liter) at pH 5.8. Fully expanded, dark
green leaves of about two-month-old plants will be used for bombardment.
Leaves will be placed abaxial side up on Whatman No. 1 filter paper lying on
RMOP medium (79) in standard petri plates (100x15 mm) for bombardment. Gold (0.6 µm)
microprojectiles will be coated with plasmid DNA (chloroplast vectors) and bombardments will
be carried out with the biolistic device PDS1000/He (Bio-Rad) as described by Daniell (110).
Following bombardment, petri plates will be sealed with parafilm and incubated at 24°C under a
12 h photoperiod. Two days after bombardment, leaves will be chopped into small pieces of ~5
mm² and placed on the selection medium (RMOP containing 500 µg/ml of spectinomycin
dihydrochloride) with the abaxial side touching the medium in deep (100x25 mm) petri plates (~10
pieces per plate). The regenerated spectinomycin-resistant shoots will be chopped into small
pieces (~2 mm²) and subcloned into fresh deep petri plates (~5 pieces per plate) containing the
same selection medium. Resistant shoots from the second culture cycle will be transferred to the
rooting medium (MSO medium supplemented with IBA, 1 mg/liter, and spectinomycin
dihydrochloride, 500 mg/liter). Rooted plants will be transferred to soil and grown at 26°C
under 16 hour photoperiod conditions for further analysis.

PCR analysis of putative transformants: PCR will be done using DNA isolated from control
and transgenic plants in order to distinguish a) true chloroplast transformants from mutants and
b) chloroplast transformants from nuclear transformants. Primers for testing the presence of the
aadA gene (which confers spectinomycin resistance) in transgenic plants will land on the
aadA coding sequence and the 16S rRNA gene. In order to test chloroplast integration of the genes,
one primer will land on the aadA gene while another will land on the native chloroplast genome.
No PCR product will be obtained with nuclear transgenic plants using this set of primers. The
primer set will be used to test integration of the entire gene cassette without any internal deletion
or looping out during homologous recombination. A similar strategy has been used successfully by
us to confirm chloroplast integration of foreign genes (3,85-88). This screening is essential to
eliminate mutants and nuclear transformants. In order to conduct PCR analyses in transgenic
plants, total DNA from unbombarded and transgenic plants will be isolated as described by
Edwards et al. (129). Chloroplast transgenic plants containing the desired gene will be moved to
a second round of selection in order to achieve homoplasmy.

Southern Analysis for homoplasmy and copy number: Southern blots will be done to
determine the copy number of the introduced foreign gene per cell as well as to test
homoplasmy. There are several thousand copies of the chloroplast genome present in each plant
cell. Therefore, when foreign genes are inserted into the chloroplast genome, it is possible that
some of the chloroplast genomes have foreign genes integrated while others remain as the wild
type (heteroplasmy). Therefore, in order to ensure that only the transformed genome exists in
cells of transgenic plants (homoplasmy), the selection process will be continued. In order to
confirm that the wild type genome does not exist at the end of the selection cycle, total DNA
from transgenic plants should be probed with the chloroplast border (flanking) sequences (the
trnI-trnA fragment). If wild type genomes are present (heteroplasmy), the native fragment size
will be observed along with transformed genomes. Presence of a large fragment (due to insertion
of foreign genes within the flanking sequences) and absence of the native small fragment should
confirm homoplasmy (85,86,88).

The copy number of the integrated gene will be determined by establishing homoplasmy
for the transgenic chloroplast genome. Tobacco chloroplasts contain 5000-10,000 copies of their
genome per cell (86). If only a fraction of the genomes is actually transformed, the copy
number, by default, must be less than 10,000. By establishing that in the transgenics the
transformed genome carrying the inserted gene is the only one present, one can establish that the
copy number is 5000-10,000 per cell. This is usually done by digesting the total DNA with a
suitable restriction enzyme and probing with the flanking sequences that enable homologous
recombination into the chloroplast genome. The native fragment present in the control should be
absent in the transgenics. The absence of the native fragment proves that only the transgenic
chloroplast genome is present in the cell and that no native, untransformed chloroplast genome
lacking the foreign gene remains. This establishes the homoplasmic nature of our transformants,
simultaneously providing us with an estimate of 5000-10,000 copies of the foreign genes per
cell.
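
The interpretation rule for the Southern blot (large fragment only means homoplasmy; large plus native fragment means heteroplasmy) can be written as a small decision function. This is an illustrative sketch only: the function name, the fragment sizes and the size tolerance are hypothetical and are not taken from the experiments described here.

```python
def classify_southern(observed_kb, native_kb, transformed_kb, tol=0.1):
    """Classify a lane from the band sizes (kb) seen with the flanking-sequence probe.

    native_kb      : expected size of the untransformed (wild type) fragment
    transformed_kb : expected size of the fragment carrying the inserted genes
    tol            : size tolerance in kb when matching bands (assumed value)
    """
    has_native = any(abs(band - native_kb) <= tol for band in observed_kb)
    has_transformed = any(abs(band - transformed_kb) <= tol for band in observed_kb)
    if has_transformed and not has_native:
        return "homoplasmic"
    if has_transformed and has_native:
        return "heteroplasmic"
    if has_native:
        return "untransformed"
    return "inconclusive"


# Hypothetical fragment sizes chosen only for illustration.
NATIVE, TRANSFORMED = 4.5, 7.0
print(classify_southern([7.0], NATIVE, TRANSFORMED))
print(classify_southern([4.5, 7.0], NATIVE, TRANSFORMED))
```

Only a lane classified as homoplasmic supports the copy number estimate of 5000-10,000 per cell given above; a heteroplasmic lane implies a lower, undetermined copy number.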

Northern Analysis for transcript stability: Northern blots will be done to test the efficiency of
transcription of the genes. Total RNA will be isolated from 150 mg of frozen leaves by using the
"RNeasy Plant Total RNA Isolation Kit" (Qiagen Inc., Chatsworth, CA). RNA (10-40 µg) will
be denatured by formaldehyde treatment, separated on a 1.2% agarose gel in the presence of
formaldehyde and transferred to a nitrocellulose membrane (MSI) as described in Sambrook et
al. (130). Probe DNA (proinsulin gene coding region) will be labeled by the random-primed
method (Promega) with 32P-dCTP isotope. The blot will be pre-hybridized, hybridized and
washed as described above for Southern blot analysis. Transcript levels will be quantified by the
Molecular Analyst Program using the GS-700 Imaging Densitometer (Bio-Rad, Hercules, CA).
Expression and quantification of the total protein expressed in chloroplast: Chloroplast
expression assays will be done for each protein by Western blot. Recombinant protein levels in
transgenic plants will be determined using quantitative ELISA assays. A standard curve will be
generated using known concentrations and serial dilutions of recombinant and native proteins.
Different tissues will be analyzed using young, mature and old leaves against these primary
antibodies: goat anti-HSA (Nordic Immunology), anti-IGF-I and anti-Interferon alpha (Sigma).
Bound IgG will be measured using horseradish peroxidase-labelled anti-goat IgG.
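
As an illustration of how a sample reading is converted to a concentration with such a standard curve, the sketch below interpolates linearly between neighbouring standards. The standards, absorbance values and function name are hypothetical. Real ELISA data are often fitted with a four-parameter logistic curve instead; linear interpolation is used here only to keep the example short.

```python
def interpolate_concentration(absorbance, standards):
    """Estimate concentration from absorbance by linear interpolation.

    standards: iterable of (concentration, absorbance) pairs from the
    serial dilutions of the reference protein.
    """
    points = sorted(standards, key=lambda pair: pair[1])
    for (c0, a0), (c1, a1) in zip(points, points[1:]):
        if a0 <= absorbance <= a1:
            if a1 == a0:                      # flat segment: take the midpoint
                return (c0 + c1) / 2.0
            return c0 + (c1 - c0) * (absorbance - a0) / (a1 - a0)
    raise ValueError("absorbance outside the range of the standard curve")


# Hypothetical standard curve: (ng/ml, absorbance units).
STANDARDS = [(0.0, 0.05), (10.0, 0.25), (20.0, 0.45), (40.0, 0.85)]

# A leaf extract diluted 1:100 that reads 0.35 absorbance units.
dilution_factor = 100
in_well = interpolate_concentration(0.35, STANDARDS)
print(in_well * dilution_factor)  # estimated concentration in the undiluted extract, ng/ml
```

Readings outside the range of the standards raise an error rather than being extrapolated, which mirrors the usual practice of re-assaying such samples at a different dilution.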

Inheritance of Introduced Foreign Genes: While it is unlikely that introduced DNA would
move from the chloroplast genome to the nuclear genome, it is possible that the gene could be
integrated in the nuclear genome during bombardment and remain undetected in Southern
analysis. Therefore, among the initial tobacco transformants, some will be allowed to self-pollinate,
whereas others will be used in reciprocal crosses with control tobacco (transgenics as female
acceptors and pollen donors; testing for maternal inheritance). Harvested seeds (T1) will be
germinated on media containing spectinomycin. Achievement of homoplasmy and mode of
inheritance can be classified by looking at germination results. Homoplasmy should be indicated
by totally green seedlings (86), while heteroplasmy is displayed by variegated leaves (lack of
pigmentation, 83). Lack of variation in chlorophyll pigmentation among progeny should also
underscore the absence of position effect, an artifact of nuclear transformation. Maternal
inheritance will be demonstrated by sole transmission of introduced genes via seed generated on
transgenic plants, regardless of pollen source (green seedlings on selective media). When
transgenic pollen is used for pollination of control plants, the resultant progeny would not be
resistant to the chemical in the selective media (they will appear bleached; 83). Molecular analyses will
confirm transmission and expression of introduced genes, and T2 seed will be generated from
those plants confirmed by the analyses described above.

d.6 Purification methods: The standard method of purification will employ classical
biochemical techniques with the crystallized proteins inside the chloroplast. In this case, the
homogenates will be passed through miracloth to remove cell debris. Centrifugation at 10,000 x g
would pellet all foreign proteins (3). Proteins will be solubilized using pH, temperature gradient,
etc. This is possible if ORF1 and ORF2 of the cry2Aa2 operon (see section c) can fold and
crystallize the recombinant proteins as expected. If there is no crystal formation, other
purification methods will be used (classical biochemistry techniques and affinity columns with
protease cleavage).
HSA: Albumin is typically administered in tens of gram quantities. At a purity level of 99.999%
(a level considered sufficient for other recombinant protein preparations), recombinant HSA
(rHSA) impurities on the order of one mg will still be injected into patients. Impurities from
the host organism must therefore be reduced to a minimum. Furthermore, purified rHSA must be identical
to human HSA. Despite these stringent requirements, purification costs must be kept low. To
purify the HSA obtained by gene manipulation, it is not appropriate to apply the conventional
processes for purifying HSA originating in plasma as such. This is because the impurities to be
eliminated from rHSA differ completely from those contained in HSA originating in plasma.
Namely, rHSA is contaminated with, for example, coloring matter characteristic of recombinant
HSA, proteins originating in the host cells, polysaccharides, etc. In particular, it is necessary to
sufficiently eliminate components originating in the host cells, since they are foreign matter for
living organisms including humans and can cause the problem of antigenicity.
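
The impurity load quoted above follows from simple arithmetic, sketched below. The function name and the example doses are illustrative assumptions; the text itself gives only "tens of grams" and 99.999% purity.

```python
def impurity_mg(dose_g: float, purity_percent: float) -> float:
    """Mass of impurities (mg) co-administered with a dose of given purity."""
    if not 0.0 <= purity_percent <= 100.0:
        raise ValueError("purity must be between 0 and 100 percent")
    impurity_fraction = (100.0 - purity_percent) / 100.0
    return dose_g * 1000.0 * impurity_fraction   # grams -> milligrams


# Illustrative doses spanning "tens of grams" at 99.999% purity.
for dose in (10.0, 50.0, 100.0):
    print(dose, "g ->", round(impurity_mg(dose, 99.999), 3), "mg impurities")
```

Under these assumptions a 10 g dose carries about 0.1 mg of impurities and a 100 g dose about 1 mg, which is consistent with the order of magnitude stated in the text.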

In plants, two different methods of HSA purification have been used at laboratory scale.
Sijmons et al. (23) transformed potato and tobacco plants with Agrobacterium tumefaciens. For
the extraction and purification of HSA, 1000 g of stem and leaf tissue was homogenized in 1000
ml cold PBS, 0.6% PVP, 0.1 mM PMSF and 1 mM EDTA. The homogenate was clarified by
filtration and centrifuged, and the supernatant was incubated for 4 h with 1.5 ml polyclonal anti-HSA
coupled to Reactigel spheres (Pierce Chem) in the presence of 0.5% Tween 80. The
HSA-anti-HSA-Reactigel complex was collected and washed with 5 ml 0.5% Tween 80 in PBS. HSA was
desorbed from the Reactigel complex with 2.5 ml of 0.1 M glycine pH 2.5, 10% dioxane,
immediately followed by a buffer exchange with Sephadex G25 to 50 mM Tris pH 8. The
sample was then loaded on a HR5/5 MonoQ anion exchange column (Pharmacia) and eluted
with a linear NaCl gradient (0-350 mM NaCl in 50 mM Tris pH 8 in 20 min at 1 ml/min).
Fractions containing the concentrated HSA (at 290 mM NaCl) were lyophilized and applied to a
HR 10/30 Sepharose 6 column (Pharmacia) in PBS at 0.3 ml/min. However, this method uses
affinity columns (polyclonal anti-HSA) that are very expensive to scale up. Also, the protein is
released from the column with 0.1 M glycine pH 2.5, which will most probably denature the
protein. Therefore, this method will be suitably modified.
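
For orientation, the point in the linear gradient at which the HSA fractions elute can be estimated from the figures given above (0-350 mM NaCl over 20 min at 1 ml/min, HSA at 290 mM). This is a rough sketch that assumes an ideal linear gradient and ignores column dead volume and mixing delay; the function name is hypothetical.

```python
def gradient_elution_point(target_mM, start_mM=0.0, end_mM=350.0,
                           duration_min=20.0, flow_ml_min=1.0):
    """Return (time in min, volume in ml) at which a linear gradient reaches target_mM."""
    low, high = min(start_mM, end_mM), max(start_mM, end_mM)
    if not low <= target_mM <= high:
        raise ValueError("target concentration is outside the gradient range")
    time_min = duration_min * (target_mM - start_mM) / (end_mM - start_mM)
    return time_min, time_min * flow_ml_min


time_min, volume_ml = gradient_elution_point(290.0)
print(round(time_min, 2), "min,", round(volume_ml, 2), "ml")
```

With the stated parameters the 290 mM point falls roughly 16.6 min (about 16.6 ml) into the gradient, that is, in its final fifth.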

The second method is for HSA extraction and purification from potato tubers (Dr.
Mingo-Castel's laboratory). After grinding the tuber in phosphate buffer pH 7.4 (1 mg/2 ml), the
homogenate is filtered through miracloth and centrifuged at 14,000 rpm for 15 minutes. After this
step, another filtration of the supernatant through 0.45 µm filters is necessary. Then, chromatography
by FPLC using a DEAE Sepharose Fast Flow column (Amersham) is required.
Fractions recovered are passed through an affinity column (Blue Sepharose Fast Flow, Amersham),
resulting in a product of high purity. HSA purification based on both methods will be
investigated.


IGF-I: All earlier attempts to produce IGF-I in E. coli or Saccharomyces cerevisiae have
resulted in misfolded proteins. This has made it necessary to perform additional in vitro refolding
or extensive separation techniques in order to recover the native and biologically active form of the
molecule. In addition, IGF-I has been demonstrated to possess an intrinsic thermodynamic
folding problem with regard to quantitatively folding into a native disulfide-bonded
conformation in vitro (131). Samuelsson et al. (131) and Joly et al. (132) co-expressed IGF-I
with specific proteins of E. coli that significantly improved the relative yields of correctly folded
protein, consequently facilitating purification. Samuelsson et al. (132) fused the protein to
affinity tags based on either the IgG-binding domain (Z) from Staphylococcal protein A or the
two serum albumin binding domains (ABP) from Streptococcal protein G (134). The fusion protein
concept allows the IGF-I molecules to be purified by IgG or HSA affinity chromatography. We
could also use these Z tags for protein purification, including the double Z domain from S. aureus
protein A and a sequence recognized by TEV protease (see section d.2). The fusion protein will be
incubated with an IgG column, where binding via the Z domain is expected to occur. The Z
domain-IgG interaction is very specific and has high affinity, so contaminant proteins can be easily
washed off the column. Incubation of the column with TEV protease will elute mature IGF-I
from the column. TEV protease is produced in bacteria in large quantities fused to a 6-histidine
tag that is used for TEV purification. This tag can also be used to separate IGF-I from
contaminant TEV protease. The method could be tested easily in E. coli before doing tobacco
transformation.

IFN-α: In the E. coli expression method developed in Dr. Prieto's laboratory (unpublished data),
the purification system was based on using 6-histidine tags that bind to a nickel column and
biotinylated thrombin to eliminate the tag on IFN-α5 (see section d.2). We propose using the
same method as a first attempt at purification. This method could be tested on E. coli expressed
proteins.

d.7 Characterization of the recombinant proteins: For the safe use of recombinant proteins as
a replacement in any of the current applications, these proteins must be structurally equivalent
and must not contain abnormal host-derived modifications. To confirm compliance with these
requirements, we will compare human and recombinant proteins using the highly sensitive
and highly resolving techniques currently expected by the regulatory authorities to characterize
recombinant products (135).

1- Amino acid analysis: Amino acid analysis to confirm the correct sequence will be performed
following off-line vapour phase hydrolysis using an ABI 420A amino acid derivatizer with an
on-line 130A phenylthiocarbamyl-amino acid analyzer (Applied Biosystems/ABI).
N-terminal sequence analysis will be performed by Edman degradation using an ABI 477A protein
sequencer with an on-line 120A phenylthiohydantoin-amino acid analyzer. Automated
C-terminal sequence analysis will use a Hewlett-Packard G1009A protein sequencer. To
confirm the C-terminal sequence to a greater number of residues, the C-terminal tryptic
peptide will be isolated from tryptic digests by reverse-phase HPLC.

2- Protein folding and disulfide bridge formation: Western blots with reducing and
non-reducing gels will be done to check protein folding. PAGE to visualize small proteins will be
done in the presence of tricine. Protein standards (Sigma) will be loaded to compare the
mobility of the recombinant proteins. PAGE will be performed on PhastGels (Pharmacia
Biotech). Proteins will be blotted and then probed with goat anti-HSA, interferon alpha and
IGF-I polyclonal antibodies. Bound IgG will be detected with horseradish peroxidase-labelled
anti-goat IgG and visualized on X-ray film using ECL detection reagents
(Amersham).

3- Tryptic mapping: To confirm the presence of chloroplast expressed proteins with disulfide
linkages identical to native human proteins, the samples will be subjected to tryptic digestion
followed by peptide mass mapping using matrix-assisted laser desorption ionization mass
spectrometry (MALDI-MS). Samples will be reduced with dithiothreitol, alkylated with
iodoacetamide and then digested with trypsin in three additions of 1:100
enzyme/substrate over 48 h at 37°C. Subsequently, tryptic peptides will be separated by
reverse-phase HPLC on a Vydac C18 column.

4- Mass analysis: Electrospray mass spectrometry (ESMS) will be performed using a VG
Quattro electrospray mass spectrometer. Samples will be desalted prior to analysis by
reverse-phase HPLC using an acetonitrile gradient containing trifluoroacetic acid.

5- CD: Spectra will be measured in a nitrogen atmosphere using a Jasco J600
spectropolarimeter.

6- Chromatographic techniques: For HSA, analytical gel-permeation HPLC will be performed
using a TSK G3000 SWxl column. Preparative gel permeation chromatography of HSA will
be performed using a Sephacryl S200 HR column. The monomer fraction, identified by
absorbance at 280 nm, will be dialyzed and reconcentrated to its starting concentration. For
reversed-phase chromatography of IGF-I, the SMART system (Pharmacia Biotech) will be
used with the µRPC C2/C18 SC 2.1/10 column.

7- Viscosity: This is a classical assay for recombinant HSA. Viscosity is a characteristic of
proteins related directly to their size, shape, and conformation. The viscosities of HSA and
recombinant HSA will be measured at 100 mg/ml in 0.15 M NaCl using a U-tube
viscosimeter (M2 type, Poulton, Selfe and Lee Ltd, Essex, UK) at 25°C.

8- Glycosylation: Chloroplast proteins are not known to be glycosylated. However, there are no
publications to confirm or refute this assumption. Therefore, glycosylation will be measured
using a scaled-up version of the method of Ahmed and Furth (136).

d.8 Biological Assays: 

Since HSA does not have enzymatic activity, it is not possible to run biological assays.
Three different techniques will be used to check IGF-I functionality. All of them are based on the
proliferation of IGF-I responding cells. First, radioactive thymidine uptake will be measured in
3T3 fibroblasts, which express the IGF-I receptor, as an estimate of DNA synthesis. Also, a human
megakaryoblastic cell line, HU-3, will be used. As HU-3 grows in suspension, changes in cell
number and stimulation of glucose uptake induced by IGF-I will be assayed using AlamarBlue
or glucose consumption, respectively. AlamarBlue (Accumed International, Westlake, OH) is
reduced by mitochondrial enzyme activity. The reduced form of the reagent is fluorescent and
can be quantitatively detected, with excitation at 530 nm and emission at 590 nm.
AlamarBlue will be added to the cells for 24 hours after 2 days of induction with different doses of
IGF-I in the absence of serum. Glucose consumption by HU-3 cells will be measured using a
colorimetric glucose oxidase procedure provided by Sigma. HU-3 cells will be incubated in the
absence of serum with different doses of IGF-I. Glucose will be added for 8 hours and the glucose
concentration will be measured in the supernatant. All three methods to measure IGF-I
functionality are precise, accurate and dose dependent, with a linear range between 0.5 and 50
ng/ml (137).

The method to determine IFN activity will be based on its antiviral properties. This
procedure measures the ability of IFN to protect HeLa cells against the cytopathic effect of
encephalomyocarditis virus (EMC). The assay will be performed in a 96-well microtitre plate.
First, HeLa cells will be seeded in the wells and allowed to grow to confluency. Then, the
medium will be removed, replaced with medium containing IFN dilutions and incubated for 24
hours. EMC virus will be added and 24 hours later the cytopathic effect will be measured. For
this, the medium will be removed, and the wells will be rinsed two times with PBS and stained with
methyl violet dye solution. The optical density will be read at 540 nm. The values of optical
density are proportional to the antiviral activity of IFN (138). Specific activity will be
determined with reference to standard IFN-α (code 82/576) obtained from NIBSC.


d.9 Animal testing and Pre-Clinical Trials:

If albumin can be produced at adequate levels in tobacco and the physicochemical
properties of the product correspond to those of the natural protein, toxicology studies need to be
done in mice. To avoid a response of the mice to the human protein, transgenic mice carrying HSA
genomic sequences will be used (139). After injection of 0, 1, 10, 50 and 100 mg of purified
recombinant protein, classical toxicology studies will be carried out (body weight and food
intake, animal behavior, piloerection, etc). Pharmaceutical companies will be contacted for
further toxicology studies and clinical development of the product. Albumin could be tested for
blood volume replacement after paracentesis to eliminate the fluid from the peritoneal cavity in
patients with liver cirrhosis. It has been shown that albumin infusion after this maneuver is
essential to preserve effective circulatory volume and renal function (140).

IGF-I and IFN-α will be tested for biological effects in vivo in animal models. Dr.
Prieto's laboratory has extensive experience working with woodchucks (Marmota monax)
infected with the woodchuck hepatitis virus (WHV), widely considered the best animal model
of hepatitis B virus infection (141). Preliminary studies performed in Dr. Prieto's laboratory have
shown, by real-time polymerase chain reaction (PCR), a significant increase in 2'-5' oligoadenylate
synthase RNA levels in woodchuck peripheral blood mononuclear cells (PBMC) upon incubation with
human IFN-α5, a proof of the biological activity of human IFN-α5 in woodchuck cells. For in
vivo studies, a total of 7 woodchucks chronically infected with WHV (WHV surface antigen and
WHV-DNA positive in serum) will be used: 5 animals will be injected subcutaneously with
500,000 units of human IFN-α5 (the activity of human IFN-α5 will be determined as described
previously) three times a week for 4 months; the remaining two woodchucks will be injected
with placebo and used as controls. Follow-up will include weekly serological (WHV surface
antigen and anti-WHV surface antibodies by ELISA) and virological (WHV DNA in serum by
real-time quantitative PCR) studies, as well as monthly immunological studies (T-helper responses
against WHV surface and WHV core antigens, measured by interleukin 2 production from PBMC
incubated with those proteins). Finally, basal and end-of-treatment liver biopsies will be performed
to score liver inflammation and intrahepatic WHV-DNA levels. The final goal of treatment will
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f viral replication by WHV-DNA in serum, with secondary end points being 
histological improvement and decrease in intrahepatic WHV-DNA levels. If IFN-α5 proves to
exert antiviral activity against WHV in the woodchuck, a search will be conducted for possible 
industrial partners interested in the clinical development of this product. 

For IGF-I, the in vivo therapeutic efficacy will be tested in animals in situations of IGF-I
deficiency, such as liver cirrhosis in rats. Dr. Prieto's lab has published several reports (56-58)
showing that recombinant human IGF-I has marked beneficial effects in increasing bone and
muscle mass, improving liver function and correcting hypogonadism. Briefly, the induction
protocol will be as follows: liver cirrhosis will be induced in rats by inhalation of carbon
tetrachloride (CCl4) twice a week for 11 weeks, with a progressively increasing exposure time from
1 to 5 minutes per gassing session. After the 11th week, animals will continue receiving CCl4 once a
week (3 minutes per inhalation) to complete 30 weeks of CCl4 administration. During the whole
induction period, phenobarbital (400 mg/L) will be added to drinking water. To test the
therapeutic efficacy of tobacco-derived IGF-I, cirrhotic rats will receive 2 µg/100 g body
weight/day of this compound in two divided doses during the last 21 days of the induction
protocol (weeks 28, 29, and 30). On day 22, animals will be sacrificed and liver and blood
samples will be collected. The results will be compared to those obtained in cirrhotic animals
receiving placebo instead of tobacco-derived IGF-I, and to healthy control rats. As in the case of
IFN-α, if plant-derived IGF-I (in addition to exerting characteristic biological effects in vitro)
reproduces the effects of the commercial recombinant IGF-I in vivo, pharmaceutical companies
will be contacted for further preclinical and clinical development. IGF-I can be tested in patients
with liver cirrhosis and poor nutritional status.
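
The dosing and exposure arithmetic in the protocol above can be checked with a short calculation. The following is only an illustrative sketch: the dose, treatment length and CCl4 schedule are taken from the text, while the function names, the 300 g body weight and the assumption of constant body weight over the treatment course are hypothetical.

```python
# Dose arithmetic for the IGF-I treatment arm (illustrative sketch only).
# Protocol values come from the text; the 300 g body weight is hypothetical.

DOSE_UG_PER_100_G_PER_DAY = 2.0  # 2 ug per 100 g body weight per day
DOSES_PER_DAY = 2                # administered in two divided doses
TREATMENT_DAYS = 21              # last 21 days of the induction protocol


def daily_dose_ug(body_weight_g: float) -> float:
    """Total IGF-I per animal per day, in micrograms."""
    if body_weight_g <= 0:
        raise ValueError("body weight must be positive")
    return DOSE_UG_PER_100_G_PER_DAY * body_weight_g / 100.0


def dose_per_injection_ug(body_weight_g: float) -> float:
    """IGF-I per single administration, in micrograms."""
    return daily_dose_ug(body_weight_g) / DOSES_PER_DAY


def total_course_ug(body_weight_g: float) -> float:
    """IGF-I needed per animal for the whole course.

    Assumes body weight stays constant over the 21 days (a simplification).
    """
    return daily_dose_ug(body_weight_g) * TREATMENT_DAYS


def ccl4_sessions(twice_weekly_weeks: int = 11, total_weeks: int = 30) -> int:
    """Number of CCl4 inhalation sessions: twice weekly, then once weekly."""
    once_weekly_weeks = total_weeks - twice_weekly_weeks
    return 2 * twice_weekly_weeks + once_weekly_weeks


if __name__ == "__main__":
    weight_g = 300.0  # hypothetical rat body weight
    print("daily dose (ug):", daily_dose_ug(weight_g))
    print("per injection (ug):", dose_per_injection_ug(weight_g))
    print("full course (ug):", total_course_ug(weight_g))
    print("CCl4 sessions:", ccl4_sessions())
```

In practice the dose would be recalculated from each animal's current weight rather than a fixed value.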

Import of certain passages from Spain inactivated the spelling and grammar check
function for this file. Every effort was made to do spelling and grammar checks manually.
The investigators apologize to reviewers for inadvertent omissions.

Tentative Proposed Schedule 



Year I:

a) Develop recombinant DNA vectors for enhanced translation of all therapeutic proteins via
chloroplast genomes of tobacco

b) Test protein purification and processing using chloroplasts vectors in E. coli 

c) Obtain transgenic tobacco plants using the transformation vectors 

d) Assay transgenic expression of therapeutic proteins in chloroplasts using molecular 
and biochemical methods 



Year II: 

a) Develop recombinant DNA vectors for enhanced translation of all therapeutic proteins via
chloroplast genomes of tobacco for efficient processing

b) Test protein purification and processing using chloroplasts vectors in E. coli 

c) Obtain transgenic tobacco plants using the transformation vectors 

d) Assay transgenic expression of processed therapeutic proteins in chloroplasts using 
molecular and biochemical methods 

Year III:

a) Employ existing methods of purification from transgenic leaves or develop new approaches 

b) Analyze genetic composition of transgenic plants (Mendelian or maternal inheritance) 

d) Large-scale purification of therapeutic proteins from greenhouse-grown transgenic plants and
comparison of current purification methods with newly developed methods

e) Animal testing, pre-clinical trials 

Year IV:

a) Refolding and characterization/comparison (yield and purity) of therapeutic proteins
produced in E. coli or yeast with those produced in transgenic tobacco

b) Animal testing, pre-clinical trials 

c) Continue to characterize subsequent transgenic generations (T1, T2, T3).

EXPRESSION OF HUMAN THERAPEUTIC PROTEINS IN TRANSGENIC TOBACCO
CHLOROPLASTS - Henry Daniell

In the near future, demand for existing biopharmaceuticals as well as new therapeutic proteins is
expected to rise considerably. Therefore, it is important to evaluate alternative transgenic production
systems and ensure the availability of safe biopharmaceuticals in a cost-effective manner. Chloroplast
genetic engineering promises to be one of the best available techniques, since foreign gene expression
of up to 46% of total soluble protein has been demonstrated recently (study from our lab featured on the
cover of Nature Biotechnology, January 2001). The specific aim of this proposal is to optimize
production of therapeutic proteins such as human serum albumin (HSA), insulin-like growth factor
(IGF-I) and interferon α (IFNα) in chloroplast transgenic plants for future utilization of this system for
biopharmaceutical production in plants.

Chloroplast expression of Human Serum Albumin (HSA): We have already initiated transformation
of the tobacco chloroplast genome for hyperexpression of HSA. The HSA codon composition is ideal
for chloroplast expression and no changes in the nucleotide sequence were necessary (see page 37
of the proposal). We designed several vectors to optimize HSA expression using different 5' regulatory
regions. All of these contain ATG as the first codon of the mature protein. The first vector
(pLD-RBS-HSA) includes the chloroplast-preferred ribosome binding site (RBS) sequence GGAGG. In the
second vector (pLD-5'psbA-HSA), HSA was cloned downstream of the psbA 5' region, including the
promoter and untranslated region (UTR), which has been shown to enhance translation. The third vector
(pLD-Orf1Orf2-HSA) introduced the putative chaperonin (Orf2) of the B.t. cry2Aa2 operon upstream of
the HSA gene; this chaperonin has been shown to fold foreign proteins and form crystals, aiding in
protein stability and purification.

All chloroplast vectors were introduced into tobacco leaves by particle bombardment, and after
4 weeks shoots appeared as a result of independent transformation events. All shoots were tested by
PCR to verify integration into the chloroplast genome using the method described on page 39 of the
proposal. The positive clones were passed through a second round of selection to achieve
homoplasmy and transferred to pots. The phenotype of these plants was completely normal.
Transgenic leaves analyzed by western blots consistently showed the same pattern of expression
depending on the 5' region used in the transformation vector (see Figure 1). Maximum levels of
expression were observed in the plants transformed with HSA preceded by the psbA 5' UTR and
promoter. Molecular characterization of the first generation is in progress. Southern blots of several
clones showed homoplasmy in all transgenic lines except one (see clone # 6, Figure 2). Northern blots
showed transcripts of different lengths depending on the 5' regulatory region that was inserted upstream
of the HSA gene (see Figure 3). The most abundant transcript was the monocistron in plants with the
5' psbA promoter upstream of the HSA gene. Polycistrons of different lengths were observed based on
the number of promoters used in each construct and differential processing.

We observed different levels of HSA in ELISA depending on the extraction buffer used,
and further optimization of this procedure is in progress. With incomplete extraction procedures, the
highest HSA level of expression in plants transformed with pLD-5'psbA-HSA was up to 11.1% of total
soluble protein; this is more than 100-fold the expression observed with the other two constructs (see
Figure 4). Because we have routinely observed high levels of foreign gene expression with the other two
vectors, we anticipate that the actual level of HSA expression in pLD-5'psbA-HSA may exceed 50%
of total soluble protein. Since the expression of HSA under the 5'psbA control is light dependent, the
time of the tissue harvest for expression studies is important. Such changes in HSA accumulation are
currently being investigated using ELISA and Northerns.
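
Expression levels such as "11.1% of total soluble protein" and "more than 100-fold" are simple ratios of ELISA and total-protein measurements. The sketch below illustrates that arithmetic only. The function names and all sample quantities are hypothetical and are not data from this work.

```python
# Percent of total soluble protein (TSP) and fold difference between constructs.
# Illustrative sketch: all input quantities below are hypothetical examples.
import math


def percent_tsp(target_protein_ng: float, total_soluble_protein_ng: float) -> float:
    """Target protein (e.g. HSA by ELISA) as a percentage of TSP in the same extract."""
    if total_soluble_protein_ng <= 0:
        raise ValueError("total soluble protein must be positive")
    if target_protein_ng < 0:
        raise ValueError("target protein amount cannot be negative")
    return 100.0 * target_protein_ng / total_soluble_protein_ng


def fold_difference(high_percent_tsp: float, low_percent_tsp: float) -> float:
    """Ratio of two expression levels given as percent TSP."""
    if low_percent_tsp <= 0:
        raise ValueError("reference expression level must be positive")
    return high_percent_tsp / low_percent_tsp


if __name__ == "__main__":
    # Hypothetical extract: 1,110 ng HSA detected in 10,000 ng TSP.
    high = percent_tsp(1110.0, 10000.0)
    # Hypothetical low-expressing construct: 10 ng HSA in 10,000 ng TSP.
    low = percent_tsp(10.0, 10000.0)
    print(f"high expresser: {high:.1f}% TSP")
    print(f"low expresser: {low:.1f}% TSP")
    print(f"fold difference: {fold_difference(high, low):.0f}")
    assert math.isclose(high, 11.1)
```

Because both the ELISA signal and the TSP estimate depend on the extraction buffer, the two values should come from the same extract for the ratio to be meaningful.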

Characterization of HSA from transgenic chloroplasts for proper folding, disulfide bond 
formation and functionality is in progress. The stromal pH within chloroplasts and the presence of both
thioredoxin and disulfide isomerase systems provide optimal conditions for proper folding and disulfide 
bond formation within folded HSA. 

Chloroplast expression of Insulin-Like Growth Factor (IGF-I): From previous studies (see page 30,
table 1) we observed that the IGF-I gene coding sequence is not suitable for high levels of expression in
chloroplasts. Therefore, we have determined the optimal chloroplast sequence and employed a
recursive PCR method (see page 37) for total gene synthesis (see Figure 5). The newly synthesized
gene was cloned into a pCR 2.1 vector. Insertion of the zz-tev sequence upstream of the IGF-I coding
sequence (see pages 41-42) to facilitate subsequent purification is in progress.

To demonstrate expression, purification and proper cleavage of the fusion protein, we also
cloned the full-length IGF-I (including the pre-sequence) in an alphavirus vector and expressed the
protein in human cultured cells. The alphavirus system was used because it expresses adequate
amounts of protein to induce a very good immune response in test animals. We observed that the
protein had the predicted size, is properly cleaved in cells to produce the mature protein and is
exported into the growth medium. This secreted protein could be immunoprecipitated using anti-IGF-I
antibody. The zz-tev-IGF-I was also cloned in an alphavirus vector, expressed and labeled in human
cultured cells. This has allowed us to see that the protein had the predicted size and, as expected, is
not secreted. To cleave the zz tag after purification from chloroplasts, TEV protease is necessary (see
page 42). Therefore, we have expressed and purified TEV protease in bacteria. After purification we
obtained approximately 0.5 mg. This TEV protease cleaved the labeled zz-tev-IGF-I, producing two
fragments, zz-tev and mature IGF-I. We are currently labeling more fusion protein to optimize
conditions for TEV cleavage.

Chloroplast expression of Interferon α5 (IFN-α5): As proposed, we have cloned human IFNα5,
fused with a histidine tag (to aid subsequent purification, see page 42) and introduced the gene into
the chloroplast transformation vector (pLD). Western blots demonstrated expression of the IFNα5
protein in E. coli using pLD vectors, and the maximum level was observed with the 5'psbA UTR and
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gene was cloned into the pLD using both sequences and bombarded into tobacco 
leaves. Shoots appeared after 5 weeks and the second round of selection is in progress. 

All proposed experiments done so far have yielded the expected results. With successful
hyperexpression of HSA observed in chloroplast transgenic plants (500-fold higher than previous reports
of nuclear transgenic plants in the literature), we are optimistic that transgenic chloroplasts will
emerge as a biopharmaceutical production system in the near future. NIH funding in support of this
proposal would make this a reality.



Expression of HSA via the chloroplast genome in tobacco.

Figure 1: Western blot of tobacco protein extracts. A) 1: 40 ng pure HSA; 2: molecular weight marker; 3, 4, 6:
untransformed plant extracts; 5: extract from plants transformed with pLD-5'UTR-HSA; 7: pLD-Orf1Orf2-HSA. B)
1: 40 ng pure HSA; 2: molecular weight marker; 3, 5: untransformed plant extracts; 4: extract from plants transformed
with pLD-RBS-HSA; 6: pLD-Orf1Orf2-HSA. 10 micrograms of plant protein were loaded in each well.



